

BIOMETRIKA 


A JOURNAL FOR THE STATISTICAL STUDY OF 
BIOLOGICAL PROBLEMS 


FOUNDED BY 

W. F. R. WELDON, FRANCIS GALTON and KARL PEARSON 


EDITED BY 

EGON S. PEARSON 


IN CONSULTATION WITH 


HARALD CRAMÉR
R. C. GEARY
MAJOR GREENWOOD


J. B. S. HALDANE
G. M. MORANT
JOHN WISHART


VOLUME XXXIV 

1947 


ISSUED BY THE BIOMETRIKA OFFICE
UNIVERSITY COLLEGE, LONDON 
AND PRINTED AT THE 
UNIVERSITY PRESS, CAMBRIDGE 




CONTENTS OF VOLUME XXXIV 

Memoirs
                                                                        PAGES

I. The variance of the overlap of geometrical figures with reference to a bombing problem. By F. Garwood. With two Figures in the Text . . . 1–17

II. A study of a first dynasty series of Egyptian skulls from Sakkara and of an eleventh dynasty series from Thebes. By A. Batrawi and G. M. Morant. With one Figure in the Text and one Folding Chart . . . 18–27

III. The generalization of ‘Student’s’ problem when several different population variances are involved. By B. L. Welch . . . 28–35

IV. The distribution of Kendall’s $\tau$ coefficient of rank correlation in rankings containing ties. By G. P. Sillitto. With one Folding Chart . . . 36–40

V. The use of range in place of standard deviation in the $t$-test. By E. Lord . . . 41–67

VI. The frequency distribution of $\sqrt{b_1}$ for samples of all sizes drawn at random from a normal population. By R. C. Geary. With eight Figures in the Text . . . 68–97

VII. On the computation of universal moments of tests of statistical normality derived from samples drawn at random from a normal universe. Application to the calculation of the seventh moment of $b_2$. By R. C. Geary and J. P. G. Worlledge . . . 98–110

VIII. The asymptotical distribution of range in samples from a normal population. By G. Elfving. With two Figures in the Text . . . 111–119

IX. Limits of the ratio of mean range to standard deviation. By R. L. Plackett . . . 120–122

X. Significance tests for 2 x 2 tables. By G. A. Barnard. With two Figures in the Text . . . 123–138

XI. The choice of statistical tests illustrated on the interpretation of data classed in a 2 x 2 table. By E. S. Pearson . . . 139–167

XII. 2 x 2 tables. A note on E. S. Pearson’s paper. By G. A. Barnard . . . 168–169

XIII. The cumulants of the $z$ and of the logarithmic $\chi^2$ and $t$ distributions. By J. Wishart . . . 170–178

XIV. The meaning of a significance level. By G. A. Barnard . . . 179–182

XV. On the distribution of the rank correlation coefficient $\tau$ when the variates are not independent. By W. Hoeffding . . . 183–196

XVI. The significance of rank correlations where parental correlation exists. By H. E. Daniels and M. G. Kendall . . . 197–208

XVII. Testing for normality. By R. C. Geary . . . 209–242



XVIII. The stratified semi-stationary population. By S. Vajda . . . 243–254

XIX. A simple approach to confounding and fractional replication in factorial experiments. By O. Kempthorne . . . 255–272

XX. A comparison of stratified with unrestricted random sampling from a finite population. By P. Armitage. With four Figures in the Text . . . 273–280

XXI. Some theorems on time series. I. By P. A. P. Moran . . . 281–291

XXII. Rank correlation between two variables, one of which is ranked, the other dichotomous. By J. W. Whitfield . . . 292–296

XXIII. The variance of $\tau$ when both rankings contain ties. By M. G. Kendall . . . 297–298

XXIV. A ‘smooth’ test for goodness of fit. By F. N. David . . . 299–310

XXV. An exact test for the equality of variances. By R. L. Plackett . . . 311–319

XXVI. The estimation from individual records of the relationship between dose and quantal response. By D. J. Finney. With four Figures in the Text . . . 320–334

XXVII. A power function for tests of randomness in a sequence of alternatives. By F. N. David. With one Figure in the Text . . . 335–339

XXVIII. A numerical solution of the problem of moments. By H. O. Hartley and S. H. Khamis . . . 340–351

XXIX. Approximation to percentage points of the $z$-distribution. By A. H. Carter . . . 352–358


Miscellanea
                                                                        PAGES

(i) Note on the cumulants of Fisher’s $z$-distribution. By L. A. Aroian . . . 359–360

(ii) A note on the mean deviation from the median. By K. R. Nair . . . 360–362

(iii) On the method of paired comparisons. By P. A. P. Moran . . . 363–366

(iv) Notes on the calculation of autocorrelations of linear autoregressive schemes. By M. H. Quenouille . . . 366–367

(v) Approximate formulae for the percentage points of the incomplete beta function and of the $\chi^2$ distribution. By D. Halton Thomson . . . 368–372

(vi) Review of C. E. Weatherburn’s A First Course in Mathematical Statistics. By F. N. David . . . 373

(vii) Review of Advances in Genetics, Vol. 1, Academic Press, New York. By J. B. S. Haldane . . . 373

(viii) Review of H. Cramér’s Mathematical Methods of Statistics. By F. N. David . . . 374



Vol. XXXIV. Parts I and II


January, 1947







Reprinted by offset-litho 1953


[Issued 11 February 1947]





THE VARIANCE OF THE OVERLAP OF GEOMETRICAL FIGURES
WITH REFERENCE TO A BOMBING PROBLEM


By F. GARWOOD, Ph.D.


1. Introduction

The present paper deals with a particular problem arising in the mathematical study of bombing. Briefly, the general problem is that of predicting the over-all effects of a bombing attack carried out under given conditions against a given target, and the mathematical treatment involves various simplifying assumptions concerning these conditions.

In the type of problem considered here, attention is centred on the total plan area of damage caused to a single building by bombs falling independently and at random over a larger area containing the building. It is assumed that each bomb damages all that part of the building contained within a circle of fixed size centred at the bomb (a square damage area is also considered), while the building has a simple plan outline, such as a rectangle or a circle. The area of damage of two or more adjacent bombs is merely the area covered by the circles. The theoretical problems dealt with are those of estimating the variance of the amount of damaged area (the estimation of the mean or expected damage presents little difficulty). It would be more satisfactory to obtain the complete frequency distributions, but this has so far not been achieved, nor has it been possible to obtain explicit formulae for the 3rd and 4th moments.

As there may be applications of the problems to fields completely different from those of bombing studies, and as they are problems which involve essentially the concepts of geometry and of probability, it is convenient to express them entirely in these terms.

We thus have problems of the following type. A number of circles are placed at random on a plane so that each one has some or all of its area inside a fixed square. What are the mean and variance of the area of the square covered by the circles? The fundamentals of this type of problem have been studied by Robbins (1944), and Bronowski & Neyman have dealt with another particular case.* Robbins's results enable us to deal with geometrical figures other than circles and squares, and also to deal with cases where the number, position and orientation of the ‘covering’ figures follow probability laws other than the simple ones implied in the above example.


2. Robbins's theorem

In leading up to his theorem, Robbins uses the concept of a random measurable subset $X$ of $n$-dimensional Euclidean space $E_n$. He defines the function $g(x, X)$ for every point $x$ of $E_n$ and for every $X$ as equal to 1 for $x \in X$ and zero elsewhere. His theorem is then as follows:

Let $X$ be a random Lebesgue measurable subset of $E_n$, with measure $\mu(X)$. For any point $x$ of $E_n$ let $p(x) = \Pr(x \in X)$. Then, assuming that the function $g(x, X)$ is a measurable function of the pair $(x, X)$, the expected value of the measure of $X$ will be given by the Lebesgue integral of the function $p(x)$ over $E_n$.

* Note by Editor. This paper was received for publication in September 1945; Dr Garwood has asked me to add the following note in proof. “The author had the privilege of seeing the work of Bronowski & Neyman in proof. This paper was then submitted, after which their work was published (1945) together with a second article by Robbins (1945), who has solved, among others, some of the problems dealt with in this paper, as acknowledged in later footnotes.”

liiometrika 3.) ; 



2 The variance of the overlap of geometrical figures 

Robbins generalizes this result to obtain the $m$th moment of the measure of $X$; this is the integral of the function $p(x_1, x_2, \ldots, x_m)$ over $E_{nm}$, where

$$p(x_1, x_2, \ldots, x_m) = \Pr(x_1 \in X \text{ and } x_2 \in X \ldots \text{ and } x_m \in X). \tag{1}$$

It is useful to give a simple non-rigorous proof of this result. Suppose the space to be divided into an enumerably infinite set of small elements $\omega_1, \omega_2, \ldots$. If we assume that any particular subset $X$ can be made up of a selection of the $\omega$'s, then

$$\mu(X) = \lambda_1\omega_1 + \lambda_2\omega_2 + \ldots,$$

where $\omega_i$ is here used also as the measure of the element $\omega_i$, and where the $\lambda$'s, appropriate to this particular $X$, are 0 or 1. Hence

$$\{\mu(X)\}^m = \sum_p \sum_q \cdots \lambda_p \lambda_q \cdots \omega_p \omega_q \cdots,$$

where each summation is over the whole of $E_n$, and there are $m$ such summations. Thus, writing exp for the expectation,

$$\operatorname{exp}\{\mu(X)\}^m = \sum_p \sum_q \cdots \omega_p \omega_q \cdots \operatorname{exp}(\lambda_p \lambda_q \cdots).$$

But the expectation of $\lambda_p \lambda_q \cdots$ is the probability that the elements $\omega_p, \omega_q, \ldots$ are in $X$, and on proceeding to the limit the desired result is obtained.

The verification of this result in the case, say, of the 2nd moment of a linearly distributed variate, is instructive. Thus suppose $x$ is a variate with a probability function $F(x)$, i.e. the probability of obtaining a value $\le x$ is given by the measurable function $F(x)$, where

$$F(-\infty) = 0 \quad\text{and}\quad F(\infty) = 1.$$

Define $X$ as the interval from 0 to $x$; then the expectation of the square of the measure of $X$ is the 2nd moment of $x$. To use Robbins's theorem we need the probability $p(x_1, x_2)$ that a given pair of values $x_1$ and $x_2$ both lie in the interval $0, x$ chosen at random. Using co-ordinate axes $Ox_1$, $Ox_2$, this probability is zero in the 2nd and 4th quadrants, since $0, x$ cannot contain two points $x_1$ and $x_2$ of opposite signs. In the region $A$ (see Fig. 1), where $x_1 > x_2 > 0$, the two values $x_1$ and $x_2$ are both in $0, x$ if $x > x_1$, and the probability of this is $1 - F(x_1)$. Thus in $A$,

$$p(x_1, x_2) = 1 - F(x_1).$$

Similarly in $B$, $p(x_1, x_2) = 1 - F(x_2)$, while in $C$, $p(x_1, x_2) = F(x_1)$, and in $D$, $p(x_1, x_2) = F(x_2)$.

[Fig. 1. The regions $A$, $B$, $C$ and $D$ of the $(x_1, x_2)$ plane.]

The integral of $p(x_1, x_2)$ over $A$ is seen to be

$$\int_0^{\infty} x_1[1 - F(x_1)]\,dx_1,$$

while the total integral of $p(x_1, x_2)$ over the whole plane is

$$2\int_0^{\infty} x[1 - F(x)]\,dx - 2\int_{-\infty}^{0} xF(x)\,dx.$$

A single integration by parts then leads to

$$\int_{-\infty}^{\infty} x^2\,dF(x),$$

which is the 2nd moment of $x$, as required.
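The identity just verified can also be checked numerically. The following is only a sketch under stated assumptions: the standard normal distribution is chosen purely for illustration, the function names are hypothetical, and the infinite range is truncated at a finite value on the assumption that the tails beyond it are negligible.

```python
import math


def normal_cdf(x):
    """Distribution function F(x) of a standard normal variate (illustrative choice)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def second_moment_from_cdf(cdf, upper=12.0, steps=20000):
    """Midpoint-rule value of 2*int_0^inf x[1-F(x)]dx - 2*int_-inf^0 x F(x)dx.

    The second integral is rewritten, with t = -x, as 2*int_0^inf t F(-t) dt.
    Both integrals are truncated at `upper` (assumed large enough for the
    tails to be negligible).
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x * (1.0 - cdf(x)) + x * cdf(-x)
    return 2.0 * total * h


# For the standard normal the 2nd moment about the origin is 1.
print(second_moment_from_cdf(normal_cdf))
```

Any other distribution function can be passed in place of `normal_cdf`, provided the truncation point is adjusted to its range.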






3. Application of Robbins's theorem to overlap problems

We shall be concerned with cases in $E_2$ where the subset $X$ is the part of a fixed area $A$ in the plane which is covered by a number of areas $C$ dropped independently and at random on the plane. We suppose $A$ to be the interior of a simple closed curve, while each $C$ is the interior of another curve. The area $C$ has a reference point $Q$ (conveniently called its centre) and a reference line, and it is assumed that there is a frequency distribution $\phi(x, y, \theta)$ of the position $(x, y)$ of $Q$ and of the inclination $\theta$ of the reference line to a fixed direction in $E_2$. $\phi(x, y, \theta)$ can be assumed to be zero outside an area $T$, i.e. the points $Q$ are distributed inside $T$. (In the applications the angle $\theta$ will be constant and the areas $C$ will be equally likely to fall anywhere over $T$, so that we can write $\phi(x, y, \theta) = 1/T$.) Another chance variable is $k$, the number of areas $C$; its distribution can be defined by the series $p_0, p_1, p_2, \ldots$ (which are the probabilities of 0, 1, 2, ... $C$'s falling on $T$), or by the probability generating function $G(u)$, where

$$G(u) \equiv p_0 + p_1 u + p_2 u^2 + \ldots + p_k u^k + \ldots. \tag{2}$$

Finally, it is more convenient to consider the moments of the area $Y = A - X$, i.e. the area of $A$ (we can use the symbols $Y$, etc. for either the sets or their areas) not covered by the $C$'s. Evidently the variances of $X$ and $Y$ are equal.

To obtain the 1st moment of $Y$, we need first the probability $p(x_1, y_1)$ that a point $(x_1, y_1)$ of $A$ will belong to $Y$, i.e. of $(x_1, y_1)$ not being covered by a $C$. Now $(x_1, y_1)$ will not be covered by a particular $C$ falling at an inclination $\theta$ if the centre $Q(x, y)$ falls outside an area $C(x_1, y_1, \theta)$ obtained by centring the $C$ at $(x_1, y_1)$ and rotating it through 180°. If the part of $T$ exterior to this area is called $T - C(x_1, y_1, \theta)$, and if we allow all inclinations, the probability of this occurring is

$$q(x_1, y_1) = \iiint_{T - C(x_1, y_1, \theta)} \phi(x, y, \theta)\,dx\,dy\,d\theta. \tag{3}$$

If $k$ $C$'s are dropped independently, the probability is given by $q^k(x_1, y_1)$, so that the total probability of $(x_1, y_1)$ belonging to $Y$ is

$$p(x_1, y_1) = \sum_{k=0}^{\infty} p_k\,q^k(x_1, y_1) = G(q(x_1, y_1)). \tag{4}$$

The 1st moment of $Y$ is thus, in the case of $k$ $C$'s,

$$\mu'_1(Y) = \iint_A q^k(x_1, y_1)\,dx_1\,dy_1, \tag{5}$$

and in the general case

$$\mu'_1(Y) = \iint_A G(q(x_1, y_1))\,dx_1\,dy_1. \tag{6}$$

The 2nd moment is obtained by a similar process; we require the probability $p(x_1, y_1, x_2, y_2)$ that neither of two points $(x_1, y_1)$ and $(x_2, y_2)$ is covered by a $C$. Corresponding to these two points and an inclination $\theta$ the permissible region in which each centre $Q$ can fall is

$$T - C(x_1, y_1, \theta) - C(x_2, y_2, \theta) \equiv T - C_1 - C_2.$$

Thus for one $C$ the probability is

$$q(x_1, y_1, x_2, y_2) = \iiint_{T - C_1 - C_2} \phi(x, y, \theta)\,dx\,dy\,d\theta, \tag{7}$$

giving in the case of $k$ $C$'s,

$$p(x_1, y_1, x_2, y_2) = q^k(x_1, y_1, x_2, y_2), \quad\text{and}\quad p(x_1, y_1, x_2, y_2) = G(q(x_1, y_1, x_2, y_2))$$

in the general case. Thus the 2nd moment

$$\mu'_2(Y) = \iint_A\!\iint_A q^k(x_1, y_1, x_2, y_2)\,dx_1\,dy_1\,dx_2\,dy_2 \quad\text{for } k\ C\text{'s}, \tag{8}$$

and

$$\mu'_2(Y) = \iint_A\!\iint_A G(q(x_1, y_1, x_2, y_2))\,dx_1\,dy_1\,dx_2\,dy_2 \tag{9}$$

in the general case. In general, for the $m$th moment, the probability that $(x_1, y_1), \ldots, (x_m, y_m)$ are not covered by $k$ $C$'s is

$$q^k(x_1, y_1, x_2, y_2, \ldots, x_m, y_m),$$

where

$$q(x_1, y_1, x_2, y_2, \ldots, x_m, y_m) = \iiint_{T - C_1 - C_2 \ldots - C_m} \phi(x, y, \theta)\,dx\,dy\,d\theta, \tag{10}$$

and $T - C_1 - C_2 \ldots - C_m$ is the area of $T$ outside $C$'s centred at $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$ and rotated through 180°.

Thus the $m$th moment is equal to

$$\mu'_m(Y) = \int_A \cdots \int_A q^k(x_1, y_1, x_2, y_2, \ldots, x_m, y_m)\,dx_1\,dy_1\,dx_2\,dy_2 \ldots dx_m\,dy_m$$

in the case of $k$ $C$'s, or

$$\mu'_m(Y) = \int_A \cdots \int_A G\{q(x_1, y_1, x_2, y_2, \ldots, x_m, y_m)\}\,dx_1\,dy_1\,dx_2\,dy_2 \ldots dx_m\,dy_m \tag{11}$$

in the general case.


4. Uniform distribution of covering areas at constant inclination

As mentioned above, in the cases with which we shall be dealing, the areas $C$ are equally likely to fall anywhere over $T$, and the angle $\theta$ is constant. The function $\phi(x, y, \theta)$ can be put equal to $1/T$ for points of $T$ and zero outside; the variable $\theta$, and integration with respect to it, may be omitted.

The function $q(x_1, y_1)$ is the fraction of $T$ not covered by a $C$ centred at $(x_1, y_1)$ and rotated through 180°, and in general $q(x_1, y_1, x_2, y_2, \ldots, x_m, y_m)$ is the fraction of $T$ outside $m$ $C$'s centred at $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$ and rotated through 180°.

Instead of the variate $Y$ we can consider $Y/A$, i.e. the fraction of $A$ not covered by $k$ $C$'s, and to obtain its $m$th moment we divide $\mu'_m(Y)$ by $A^m$. Also the quantity $dx_1\,dy_1 \ldots dx_m\,dy_m / A^m$ is the probability of obtaining $m$ centres in the elements of area $dx_1\,dy_1, \ldots, dx_m\,dy_m$ of $A$ if these centres are uniformly distributed over $A$.

We thus obtain the following result from (11): the $m$th moment of the fraction of $A$ not covered by $k$ $C$'s with their centres falling at random on $T$ is equal to the $k$th moment of the fraction of $T$ not covered by $m$ $C$'s with their centres falling at random on $A$ and rotated through 180°.

In the case $k = m = 1$ we can express this in a slightly different way if we (i) deal with the area common to the two areas concerned, (ii) regard all orientations as possible and as equally likely, and (iii) deal with areas rather than fractions. We obtain, in fact, the following result: the integral of the overlap of $C$ and $A$, when the centre of $C$ is taken over $T$ and all orientations are permitted, is equal to the corresponding integral of the overlap of $C$ and $T$, for all positions of the centre of $C$ on $A$ and for all orientations.

In the practical cases with which we shall deal, the area $A$ is always ‘well inside’ $T$, i.e. every point of $A$ can be reached by a $C$ centred somewhere in $T$. In such cases the formula for the mean overlap is simple; we have $m = 1$ and the fraction of $T$ not covered by one $C$ is $(T - C)/T$, which is constant for all $(x_1, y_1)$, so that its $k$th moment is $(T - C)^k/T^k$, i.e.

$$\mu'_1(Y/A) = \left(1 - \frac{C}{T}\right)^k. \tag{12}$$

If the number of $C$'s follows a probability generating function $G(u)$, the mean is given by

$$\mu'_1(Y/A) = G\!\left(1 - \frac{C}{T}\right). \tag{13}$$

For the 2nd moment we are concerned with two $C$'s centred at $(x_1, y_1)$ and $(x_2, y_2)$, and if their common area is $\Omega(x_1, y_1, x_2, y_2)$, we have

$$q(x_1, y_1, x_2, y_2) = \frac{T - 2C + \Omega(x_1, y_1, x_2, y_2)}{T}. \tag{14}$$

The 2nd moment $\mu'_2(Y/A)$ is then the expectation of $q^k$ or $G(q)$ for all pairs of points over $A$, and we no longer have a simple formula as in the case of the 1st moment. The overlap $\Omega$, however, depends on the relative positions of the two $C$'s, and therefore the number of variables in the integration is reduced from 4 to 2 or 1. This is illustrated in the following examples.

5. Circles falling on a fixed square

Assume $A$ to be a square of unit side (i.e. $A = 1$), $C$ a circle radius $a$ and $T$ a ‘square with rounded corners’, whose boundary is at a distance $a$ outside the sides of $A$. Thus

$$T = 1 + 4a + \pi a^2 \tag{15}$$

and

$$C = \pi a^2. \tag{16}$$

It is seen that the fraction $q$ of $T$ outside the two circles centres $(x_1, y_1)$ and $(x_2, y_2)$ is a function only of the distance $r$ between these points, and can therefore be written as $q(r)$. Hence if $\phi(r)$ is the frequency function of $r$, we obtain

$$\mu'_2(Y) = \int_0^{\sqrt{2}} q^k(r)\,\phi(r)\,dr, \tag{17}$$

or

$$\mu'_2(Y) = \int_0^{\sqrt{2}} G\{q(r)\}\,\phi(r)\,dr. \tag{18}$$

The area $\Omega(r)$ common to two circles radii $a$ with centres distant $r$ apart is

$$\Omega(r) = 2a^2(\theta - \sin\theta\cos\theta), \tag{19}$$

where

$$r = 2a\cos\theta \quad (r < 2a), \tag{20}$$

and

$$\Omega(r) = 0 \quad (r \ge 2a), \tag{21}$$

and

$$q(r) = \frac{1 + 4a - \pi a^2 + \Omega(r)}{1 + 4a + \pi a^2}, \tag{22}$$

where $\Omega(r)$ is given by (19), (20) and (21).

To obtain the frequency function $\phi(r)$ of $r$, we note that

$$r^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2, \tag{23}$$

where $x_1$, $x_2$, $y_1$ and $y_2$ are uniformly and independently distributed in the range 0, 1. The difference $\xi = |x_1 - x_2|$ follows the ‘triangular’ distribution,

$$df = 2(1 - \xi)\,d\xi, \tag{24}$$

from $\xi = 0$ to 1, so that the quantity $u = \xi^2 = (x_1 - x_2)^2$ follows the distribution

$$df = \frac{1 - \sqrt{u}}{\sqrt{u}}\,du. \tag{25}$$

Similarly, $v = (y_1 - y_2)^2$ follows independently the same distribution

$$df = \frac{1 - \sqrt{v}}{\sqrt{v}}\,dv. \tag{26}$$

The distribution of $r = \sqrt{(u + v)}$ is obtained by integrating the product

$$\frac{(1 - \sqrt{u})(1 - \sqrt{v})}{\sqrt{(uv)}} \tag{27}$$

of the frequencies of $u$ and $v$ over that part of the line $u + v = r^2$ within the square of unit side in which the point $u, v$ can lie, and we obtain without much difficulty for the frequency function of $r$,

$$\phi(r) = 2r(\pi - 4r + r^2) \quad\text{for } 0 < r < 1, \tag{28}$$

and

$$\phi(r) = 2r\{4\sin^{-1}(1/r) + 4\sqrt{(r^2 - 1)} - r^2 - \pi - 2\} \quad\text{for } 1 < r < \sqrt{2}. \tag{29}$$

Thus the 2nd moment of the fraction of the unit square uncovered is given by the integral (17) or (18), where $q(r)$ is given by (22), (19), (20) and (21) and $\phi(r)$ is given by (28) and (29).

It does not appear possible to reduce the integral (17) simply to elementary functions, and quadrature must be used. The integrand has discontinuities in its first derivative at $r = 1$ and $r = 2a$, so that the integration must be carried out separately over the intervals with these as end-points.
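The quadrature described above can be sketched in a few lines. This is a minimal illustration rather than the author's computation: Simpson's rule is an assumed choice of quadrature, the function names are hypothetical, and the integral (17) is split at $r = 2a$ and $r = 1$ as the text recommends.

```python
import math


def lens_area(r, a):
    """Formulae (19)-(21): area common to two circles of radius a, centres r apart."""
    if r >= 2.0 * a:
        return 0.0
    theta = math.acos(r / (2.0 * a))
    return 2.0 * a * a * (theta - math.sin(theta) * math.cos(theta))


def q_square(r, a):
    """Formula (22): fraction of T outside two circles whose centres are r apart."""
    T = 1.0 + 4.0 * a + math.pi * a * a
    return (1.0 + 4.0 * a - math.pi * a * a + lens_area(r, a)) / T


def phi_square(r):
    """Formulae (28)-(29): frequency function of the distance r in the unit square."""
    if r <= 0.0 or r >= math.sqrt(2.0):
        return 0.0
    if r <= 1.0:
        return 2.0 * r * (math.pi - 4.0 * r + r * r)
    return 2.0 * r * (4.0 * math.asin(1.0 / r) + 4.0 * math.sqrt(r * r - 1.0)
                      - r * r - math.pi - 2.0)


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3.0


def second_moment_square(a, k, n=2000):
    """Integral (17), taken separately over intervals ending at r = 2a and r = 1."""
    top = math.sqrt(2.0)
    breaks = sorted({0.0, 1.0, top, min(2.0 * a, top)})
    integrand = lambda r: q_square(r, a) ** k * phi_square(r)
    return sum(simpson(integrand, lo, hi, n) for lo, hi in zip(breaks, breaks[1:]))


def variance_square(a, k):
    """Variance of the uncovered fraction: 2nd moment minus the square of the mean (12)."""
    T = 1.0 + 4.0 * a + math.pi * a * a
    mean = (1.0 - math.pi * a * a / T) ** k
    return second_moment_square(a, k) - mean ** 2


# Illustrative (assumed) case: 5 circles of radius 0.25 on the unit square.
print(variance_square(0.25, 5))
```

With $k = 0$ the integral reduces to the total frequency of $r$, which should be unity; this gives a convenient check on (28) and (29).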


6. Overlap of circles on fixed rectangle*

We replace the square $A$ of the previous section by a rectangle $A$; for convenience we assume its sides to be $\sqrt{b}$ and $1/\sqrt{b}$, where $b > 1$, so that the ratio of the longer to the shorter side is $b$ and the area is unity. The centres of the circles radii $a$ are assumed to be equally likely to fall anywhere in a ‘rectangle with rounded corners’ $T$, whose boundary is at a distance $a$ outside $A$, i.e.

$$T = 1 + \pi a^2 + 2a(\sqrt{b} + 1/\sqrt{b}). \tag{30}$$

To obtain the 2nd moment of the fraction of the area of $A$ not covered, we calculate an integral similar to (17) or (18). The function $q(r)$ is derived from $\Omega(r)$, which remains the same, but the frequency distribution $\phi(r)$, the distance between a pair of points chosen at random in the rectangle, is different.

The co-ordinates $x_1$ and $x_2$ are uniformly and independently distributed in the range 0, $\sqrt{b}$ (if we take $Ox$ parallel to the longer side). The distribution of $u = (x_1 - x_2)^2$ is thus seen from (25) to be

$$df = \frac{1 - \sqrt{(u/b)}}{\sqrt{(u/b)}}\,\frac{du}{b} = \frac{\sqrt{b} - \sqrt{u}}{b\sqrt{u}}\,du, \tag{31}$$

while the distribution of $v = (y_1 - y_2)^2$ is

$$df = \frac{1 - \sqrt{(bv)}}{\sqrt{(bv)}}\,b\,dv. \tag{32}$$

* This problem was solved by Robbins (1945); see footnote on p. 1.





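The distribution of $r = \sqrt{(u + v)}$ is obtained by integrating the product

$$\frac{(\sqrt{b} - \sqrt{u})(1 - \sqrt{(bv)})}{\sqrt{(buv)}}$$

over that part of the line $u + v = r^2$ within the rectangle $0 < u < b$, $0 < v < 1/b$. This gives

$$\phi(r) = \phi_1(r) = 2r[\pi - 2r(\sqrt{b} + \sqrt{(1/b)}) + r^2] \quad\text{for } r < 1/\sqrt{b}, \tag{33}$$

$$\phi(r) = \phi_2(r) = 2r[2\alpha - 1/b - 2r\sqrt{b}\,(1 - \cos\alpha)] \quad\text{for } 1/\sqrt{b} < r < \sqrt{b}, \tag{34}$$

where $\alpha = \sin^{-1}(1/r\sqrt{b})$, and

$$\phi(r) = \phi_3(r) = 2r[2(\alpha - \beta) - b - 1/b + 2r\sin\beta/\sqrt{b} + 2r\sqrt{b}\cos\alpha - r^2] \quad\text{for } \sqrt{b} < r < \sqrt{(b + 1/b)}, \tag{35}$$

where $\beta = \cos^{-1}(\sqrt{b}/r)$.

Thus the 2nd moment of $Y$ can be found from (19)-(21) together with (30) and (33)-(35).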
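As a check on the frequency function of $r$ for the rectangle, the three branches can be coded and integrated; the total should be unity, and for $b = 1$ the result should coincide with the square case (29). The sketch below is hedged accordingly: the function names are hypothetical, Simpson's rule is an assumed quadrature, and the branch for $\sqrt{b} < r$ takes the coefficient of $\cos\alpha$ to be $2r\sqrt{b}$, as the derivation from (31) and (32) requires.

```python
import math


def phi_rectangle(r, b):
    """Frequency function of the distance between two random points in a
    rectangle of sides sqrt(b) and 1/sqrt(b), b >= 1 (formulae (33)-(35))."""
    sb = math.sqrt(b)
    if r <= 0.0 or r >= math.sqrt(b + 1.0 / b):
        return 0.0
    if r <= 1.0 / sb:
        return 2.0 * r * (math.pi - 2.0 * r * (sb + 1.0 / sb) + r * r)
    alpha = math.asin(min(1.0, 1.0 / (r * sb)))
    if r <= sb:
        return 2.0 * r * (2.0 * alpha - 1.0 / b
                          - 2.0 * r * sb * (1.0 - math.cos(alpha)))
    beta = math.acos(min(1.0, sb / r))
    return 2.0 * r * (2.0 * (alpha - beta) - b - 1.0 / b
                      + 2.0 * r * math.sin(beta) / sb
                      + 2.0 * r * sb * math.cos(alpha) - r * r)


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3.0


def total_frequency(b):
    """Integrate phi over its three ranges separately; the result should be 1."""
    sb = math.sqrt(b)
    breaks = [0.0, 1.0 / sb, sb, math.sqrt(b + 1.0 / b)]
    f = lambda r: phi_rectangle(r, b)
    return sum(simpson(f, lo, hi) for lo, hi in zip(breaks, breaks[1:]) if hi > lo)


for ratio in (1.0, 2.0, 4.0):
    print(ratio, total_frequency(ratio))
```

The ratios 2 and 4 are those of the rectangles considered in the numerical comparison later in the paper.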


7. Overlap of rectangles on a fixed rectangle

Assume that the fixed rectangle $A$ has sides $a$ and $b$ and the covering rectangles $C$ have sides $\alpha$ and $\beta$. The latter are assumed to be dropped with sides $\alpha$ parallel to the side $a$ and with their centres anywhere inside the rectangle $T$, which is concentric with $ab$ and has sides $a + \alpha$ and $b + \beta$. To calculate the 2nd moment of the fraction $Y/A$ of $A$ not covered by $k$ $C$'s, we use (18) and calculate the expectation of $q^k(x_1, y_1, x_2, y_2)$, where $q$ is the fraction of $T$ not covered by two $C$'s with their centres $(x_1, y_1)$ and $(x_2, y_2)$ falling at random in $A$. The area common to two $C$'s is readily seen to depend only on the difference $\xi$ of the $x$ co-ordinates of their centres and on the similar difference $\eta$ of their $y$ co-ordinates. In fact, the area can be written

$$\Omega(\xi, \eta) = [\alpha - \xi][\beta - \eta], \tag{36}$$

where the symbol* $[x]$ stands for $x$ when $x > 0$ and is zero when $x < 0$, and we obtain

$$q(x_1, y_1, x_2, y_2) = 1 + \frac{[\alpha - \xi][\beta - \eta] - 2\alpha\beta}{(a + \alpha)(b + \beta)}. \tag{37}$$

To obtain the expectation of $q^k$ we need the frequency distribution of $\xi$ and $\eta$. As in §5, $\xi$ is readily seen to follow the frequency distribution

$$df = \frac{2(a - \xi)}{a^2}\,d\xi \tag{38}$$

between 0 and $a$, with a similar distribution for $\eta$, and we obtain the result

$$\mu'_2(Y/A) = \frac{4}{a^2b^2}\int_0^a\!\!\int_0^b \left\{1 + \frac{[\alpha - \xi][\beta - \eta] - 2\alpha\beta}{(a + \alpha)(b + \beta)}\right\}^k (a - \xi)(b - \eta)\,d\xi\,d\eta. \tag{39}$$

If the $k$th power be expanded, the resulting integrals are, with the exception of the first, the product of integrals whose upper limits are $a' = \min(a, \alpha)$ and $b' = \min(b, \beta)$ respectively. We obtain

$$\mu'_2(Y/A) = \frac{4}{a^2b^2}\int_0^{a'}\!\!\int_0^{b'} \left\{1 + \frac{(\alpha - \xi)(\beta - \eta) - 2\alpha\beta}{(a + \alpha)(b + \beta)}\right\}^k (a - \xi)(b - \eta)\,d\xi\,d\eta + \frac{4}{a^2b^2}\left\{1 - \frac{2\alpha\beta}{(a + \alpha)(b + \beta)}\right\}^k \left[\int_0^a\!\!\int_0^b (a - \xi)(b - \eta)\,d\xi\,d\eta - \int_0^{a'}\!\!\int_0^{b'} (a - \xi)(b - \eta)\,d\xi\,d\eta\right]. \tag{40}$$

* The writer is indebted to Neyman & Bronowski for this convenient notation (see below).




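By a simple change of variable we obtain

$$\mu'_2(Y/A) = \frac{4}{a^2b^2}\int_{[\alpha - a]}^{\alpha}\!\int_{[\beta - b]}^{\beta} \left\{1 + \frac{\xi\eta - 2\alpha\beta}{(a + \alpha)(b + \beta)}\right\}^k (a - \alpha + \xi)(b - \beta + \eta)\,d\xi\,d\eta + \frac{1}{a^2b^2}\left\{1 - \frac{2\alpha\beta}{(a + \alpha)(b + \beta)}\right\}^k \{b^2[a - \alpha]^2 + a^2[b - \beta]^2 - [a - \alpha]^2[b - \beta]^2\}, \tag{41}$$

which is the result obtained by Bronowski & Neyman by a rather different method.*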
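The agreement between the double integral (39) and the transformed expression (41) can be checked numerically. The sketch below is an illustration under assumptions: the midpoint rule on a regular grid is an assumed quadrature, the function names are hypothetical, and (41) is taken in the form that follows from (39) by the substitution $\alpha - \xi \to \xi$, $\beta - \eta \to \eta$.

```python
def pos(x):
    """The bracket symbol [x]: x when x > 0, zero otherwise."""
    return x if x > 0.0 else 0.0


def midpoint_2d(f, x0, x1, y0, y1, n=200):
    """Midpoint rule for a double integral over the rectangle [x0,x1] x [y0,y1]."""
    if x1 <= x0 or y1 <= y0:
        return 0.0
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            total += f(x, y0 + (j + 0.5) * hy)
    return total * hx * hy


def second_moment_39(a, b, alpha, beta, k, n=200):
    """Direct evaluation of the double integral (39)."""
    T = (a + alpha) * (b + beta)

    def f(xi, eta):
        q = 1.0 + (pos(alpha - xi) * pos(beta - eta) - 2.0 * alpha * beta) / T
        return q ** k * (a - xi) * (b - eta)

    return 4.0 / (a * a * b * b) * midpoint_2d(f, 0.0, a, 0.0, b, n)


def second_moment_41(a, b, alpha, beta, k, n=200):
    """Evaluation after the change of variable, as in (41)."""
    T = (a + alpha) * (b + beta)

    def f(x, y):
        q = 1.0 + (x * y - 2.0 * alpha * beta) / T
        return q ** k * (a - alpha + x) * (b - beta + y)

    integral = midpoint_2d(f, pos(alpha - a), alpha, pos(beta - b), beta, n)
    tail = (1.0 - 2.0 * alpha * beta / T) ** k * (
        b * b * pos(a - alpha) ** 2 + a * a * pos(b - beta) ** 2
        - pos(a - alpha) ** 2 * pos(b - beta) ** 2)
    return (4.0 * integral + tail) / (a * a * b * b)


# Illustrative (assumed) dimensions: unit square covered by 3 squares of side 0.5.
print(second_moment_39(1.0, 1.0, 0.5, 0.5, 3), second_moment_41(1.0, 1.0, 0.5, 0.5, 3))
```

The two functions should agree to within the quadrature error for any admissible dimensions, including cases in which the covering rectangle is longer than the fixed rectangle in one direction.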

8. Overlap of circles on a fixed circle

We now consider a fixed circle $A$ of unit area and therefore of radius $b = 1/\sqrt{\pi}$, with $k$ circles $C$ of radius $a$ dropped at random with their centres uniformly distributed over a circle $T$ of radius $a + b$. The 2nd moment of the fraction $Y/A$ of $A$ not covered is, as in §6, the expectation of $q^k(r)$, where $q(r)$ is the fraction of $T$ not covered by two circles with centres falling at random in $A$ a distance $r$ apart. We have

$$q(r) = \frac{1 + 2a\sqrt{\pi} - \pi a^2 + \Omega(r)}{1 + 2a\sqrt{\pi} + \pi a^2}, \tag{42}$$

where $\Omega(r)$, the overlap of two circles radius $a$ with centres $r$ apart, is given by (19), (20) and (21) as before. We thus need the frequency distribution $\phi(r)$ of the distance between two points chosen at random in the circle of unit area to obtain

$$\mu'_2(Y/A) = \int_0^{2/\sqrt{\pi}} q^k(r)\,\phi(r)\,dr. \tag{43}$$

To do this we use a fairly straightforward geometrical method, finding first the probability integral

$$F(r) = \int_0^r \phi(r)\,dr, \tag{44}$$

which is the probability that the distance between the two random points is less than $r$.

The probability that the first point is between $v$ and $v + dv$ from the centre is $2v\,dv/b^2$, while if

$$A(v) = \text{area common to circles radii } b \text{ and } r \text{ with centres distance } v \text{ apart}, \tag{45}$$

it follows that the probability of the second point being within $r$ of the first is $A(v)/\pi b^2$. Hence

$$F(r) = \int_0^b \frac{2vA(v)}{\pi b^4}\,dv. \tag{46}$$

Construct the triangle with sides $r$, $b$ and $v$, and let the angles opposite to these be $\theta$, $\phi$ and $\psi$. Then the following can be readily verified:

If $r < b$,
$$A(v) = b^2\theta + r^2\phi - br\sin\psi \ \text{ if } b - r < v < b, \qquad A(v) = \pi r^2 \ \text{ if } 0 < v < b - r. \tag{47}$$

If $b < r < 2b$,
$$A(v) = b^2\theta + r^2\phi - br\sin\psi \ \text{ if } r - b < v < b, \qquad A(v) = \pi b^2 \ \text{ if } 0 < v < r - b. \tag{48}$$

The integration in (46) is carried out by parts, with $\psi$ as the ultimate variable of integration, and to do this we obtain the result

$$A'(v) = -\frac{2br\sin\psi}{v}. \tag{49}$$

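Putting

$$r = 2b\sin\tfrac{1}{2}\alpha, \tag{50}$$

we obtain, over the whole range of $r$,

$$F(r) = 1 - \frac{1}{\pi}\{(\pi - \alpha)(2\cos\alpha - 1) + \sin\alpha\,(2 - \cos\alpha)\}, \tag{51}$$

* And by Robbins (1945).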





while differentiation yields the frequency distribution as

$$\phi(r) = 2r(\pi - \alpha) - \frac{4r^2 - \pi r^4}{2b\cos\tfrac{1}{2}\alpha}. \tag{52}$$

It will be noted as a matter of interest that the chance of the two random points falling further apart than the radius of the circle is $1 - F(b) = 3\sqrt{3}/4\pi$, or 9/22 nearly.*

We thus obtain the 2nd moment of the uncovered area from (42), (43) and (52).
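The frequency function for the circle can be checked by quadrature. In the sketch below, (52) is used in the equivalent form $2r(\pi - \alpha - \sin\alpha)$, which follows from (50) because $r/b = 2\sin\tfrac{1}{2}\alpha$ and $4r^2 - \pi r^4 = 4r^2\cos^2\tfrac{1}{2}\alpha$ when $\pi b^2 = 1$. The function names and the use of Simpson's rule are assumptions made for the illustration.

```python
import math

B = 1.0 / math.sqrt(math.pi)  # radius b of the circle of unit area


def phi_circle(r):
    """Frequency function of the distance between two random points in the
    circle of unit area, written as 2r(pi - alpha - sin alpha)."""
    if r <= 0.0 or r >= 2.0 * B:
        return 0.0
    alpha = 2.0 * math.asin(r / (2.0 * B))  # from r = 2b sin(alpha/2), formula (50)
    return 2.0 * r * (math.pi - alpha - math.sin(alpha))


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3.0


total = simpson(phi_circle, 0.0, 2.0 * B)   # should be close to 1
beyond_radius = simpson(phi_circle, B, 2.0 * B)  # chance that r exceeds b
print(total, beyond_radius, 3.0 * math.sqrt(3.0) / (4.0 * math.pi), 9.0 / 22.0)
```

The second integral gives the chance that the two points lie further apart than the radius, to be compared with $3\sqrt{3}/4\pi$ and with the approximation 9/22.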


9. Use of probability generating functions

(i) Binomial

It is interesting to apply first the binomial distribution of $k$, the number of $C$'s dropped uniformly and at random on $T$. Assume that $T$ contains the centres of all the $C$'s which touch or cover $A$, and that $S$ is some larger area including $T$. If $l$ $C$'s are dropped at random on $S$, the probability generating function of the number of centres falling on $T$ is

$$G_1(u) \equiv \left(\frac{S - T + Tu}{S}\right)^l. \tag{53}$$

Thus, from the general result of §4, the $m$th moment of the fraction of $A$ uncovered is the expectation of $\left(\dfrac{S - T + Tq}{S}\right)^l$, where $q$ is the fraction of $T$ uncovered by $m$ $C$'s falling on $A$. But the expression within brackets is the fraction of $S$ uncovered. The use of the binomial generating function is thus verified.

(ii) Poisson

The Poisson distribution next suggests itself. If the number of centres follows this distribution with a mean of $\lambda$ per unit area, the probability generating function is

$$G_2(u) = e^{\lambda T(u - 1)} \tag{54}$$

and the $m$th moment will be the expectation over $A$ of $e^{\lambda T(q - 1)}$. Alternatively, we could write this as

$$\mu'_m(Y/A) = \operatorname{exp}(e^{-\lambda Z}), \tag{55}$$

where exp denotes the expectation and $Z$ = area of overlap of $m$ $C$'s falling on $A$. In particular,

$$\text{mean value of } Y/A = \mu'_1(Y/A) = e^{-\lambda C}. \tag{56}$$

Thus the $m$th moment of $Y/A$ is related to the characteristic function of $Z$, but this result does not appear to be of any theoretical importance: it does not, for instance, throw any light on the frequency distribution of $Y/A$. Formula (55) does, however, demonstrate the fact, which is otherwise obvious, that the area $T$ does not enter into the frequency distribution of the fraction of $A$ not covered by $C$'s whose fall follows the Poisson distribution.

As far as the calculation of the variance is concerned, we need to calculate first the 2nd moment of $Y/A$. In the cases where the falling areas are circles, the area of overlap $Z$ of two circles is equal to $2C - \Omega(r)$, where $\Omega(r)$ is the function given above ((19) etc.) for the area common to two $C$'s with centres $r$ apart. The 2nd moment is thus

$$e^{-2\lambda C}\int e^{\lambda\Omega(r)}\,\phi(r)\,dr, \tag{57}$$

where $\phi(r)$ is the frequency function of $r$, the formulae for which are given above for the various cases.
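Formula (57) lends itself to direct quadrature. The sketch below treats circles falling on the unit square only, under assumptions: Simpson's rule is an assumed quadrature, the function names are hypothetical, and $\lambda$ is fixed from the required mean $m = e^{-\lambda C}$ of (56). For circles of area 0.2 on the unit square, Table 1 lists variances of 0.0303 and 0.0254 at means of 0.5 and 0.75, and the computed values can be compared with these.

```python
import math


def lens_area(r, a):
    """Formulae (19)-(21): area common to two circles of radius a, centres r apart."""
    if r >= 2.0 * a:
        return 0.0
    theta = math.acos(r / (2.0 * a))
    return 2.0 * a * a * (theta - math.sin(theta) * math.cos(theta))


def phi_square(r):
    """Formulae (28)-(29): frequency function of the distance r in the unit square."""
    if r <= 0.0 or r >= math.sqrt(2.0):
        return 0.0
    if r <= 1.0:
        return 2.0 * r * (math.pi - 4.0 * r + r * r)
    return 2.0 * r * (4.0 * math.asin(1.0 / r) + 4.0 * math.sqrt(r * r - 1.0)
                      - r * r - math.pi - 2.0)


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3.0


def poisson_variance_square(C, m):
    """Variance of Y/A for circles of area C on the unit square, the Poisson
    density lambda being chosen so that the mean uncovered fraction is m."""
    a = math.sqrt(C / math.pi)        # radius of each circle
    lam = -math.log(m) / C            # from m = exp(-lambda * C), formula (56)
    top = math.sqrt(2.0)
    breaks = sorted({0.0, 1.0, top, min(2.0 * a, top)})
    integrand = lambda r: math.exp(lam * lens_area(r, a)) * phi_square(r)
    integral = sum(simpson(integrand, lo, hi) for lo, hi in zip(breaks, breaks[1:]))
    second_moment = math.exp(-2.0 * lam * C) * integral   # formula (57)
    return second_moment - m * m


print(poisson_variance_square(0.2, 0.5))
print(poisson_variance_square(0.2, 0.75))
```

The same routine applies to the other fixed shapes if `phi_square` is replaced by the appropriate frequency function of $r$ and the limits of integration are adjusted.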

* The solution to this problem (no. 698), given by Whitworth (1897), contains an error, resulting in the incorrect 
value of 35/88 nearly. 




In the case of rectangles falling on rectangles, the necessary formula for the variance is given by Neyman & Bronowski in the form of a series, viz.

$$\frac{4e^{-2\lambda\alpha\beta}}{a^2b^2}\sum_{s=1}^{\infty}\frac{\lambda^s(\alpha\beta)^{s+1}}{s!\,(s+1)^2(s+2)^2}\,\{(s+2)a - \alpha + [\alpha - a](1 - a/\alpha)^{s+1}\}\{(s+2)b - \beta + [\beta - b](1 - b/\beta)^{s+1}\}. \tag{58}$$

Table 1. Variance of fraction of fixed area not covered by areas C
falling according to Poisson distribution

(i) Circles falling on square; (ii) circles falling on rectangle 2 x 1; (iii) circles falling on rectangle 4 x 1;
(iv) circles falling on circle; (v) squares falling on square (sides parallel).

Size of falling                    Mean area not covered
area C ÷ size          Case     0.25          0.5           0.75
of fixed area                      Variance of area not covered

0.2                    (i)      [illegible]   0.0303        0.0254
                       (ii)     0.0182        [illegible]   0.0248
                       (iii)    [illegible]   0.0275        0.0229
                       (iv)     0.0190        0.0310        0.0200
                       (v)      0.0182        0.0300        0.0252

1.0                    (i)      [illegible]   0.0904        0.0789
                       (ii)     [illegible]   0.0904        0.0743
                       (iii)    [illegible]   [illegible]   0.0027
                       (iv)     [illegible]   [illegible]   0.0810
                       (v)      [illegible]   0.0945        0.0781

1.8                    (i)      [illegible]   0.1262        [illegible]
                       (ii)     0.0771        0.1193        0.1040
                       (iii)    [illegible]   0.1015        0.0824
                       (iv)     0.0829        0.1281        0.1040
                       (v)      0.0790        0.1230        0.1002

In general, the variance increases in the following order as between the different combinations of shapes:

(1) Circles on rectangle 4 x 1, (iii).

(2) Circles on rectangle 2 x 1, (ii).

(3) Squares on square, (v).

(4) Circles on square, (i).

(5) Circles on circle, (iv).

There is an exception in the last case considered, however (mean fraction of area not covered = 0.75, size of falling area ÷ fixed area = 1.8), when the order is changed somewhat. However, the variances are generally of the same approximate magnitude.

(iii) Contagious distribution

Neyman & Bronowski have included in their study the case of a contagious law of type A with two parameters (see Neyman, 1939). Here the probability generating function is

$G_3(u) = \ldots$ (59)

They have pointed out that the expression of this as a series enables calculations to be utilized from the Poisson distribution.

In the general case the $s$th moment of the fraction of $A$ not covered is the expectation of

$\ldots$ (60)

where $Z$ is the area common to $s$ $C$'s falling with their centres on $A$.






10. Numbeicai, results 

It is impossible to calculate complete tables covering all cases, but it is of interest to calculate 
a few values for the purpose of comparison, and the following combinations of areas have been 
considered: 

Fixed area Falling areas 

(i) Square Circles 

(ii) Rectangle 2x1 Circles 

(iii) Rectangle 4x1 Circles 

(iv) Circle Circles 

(v) Square Squares (with sides parallel to fixed square) 

In each case the fixed area has been made of unit size, while the falling areas were respectively
C = 0.2, 1.0 and 1.8. The areas were assumed to fall according to the Poisson distribution,
the number of centres per unit area being such that the expected fraction of the fixed area
A not covered was respectively 0.25, 0.5 and 0.75. Since from equation (56) the expected
fraction of A not covered is $m = e^{-\lambda C}$, the relations between λ and C for the 9 combinations
of m and C are λC = log_e 4, log_e 2 and log_e 4/3.
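
As a small illustration of the relation λC = -log_e m, the following sketch tabulates the density of centres for the nine combinations of m and C. The function name `poisson_density` is hypothetical and is introduced here only for illustration.

```python
import math


def poisson_density(m, C):
    """Centres per unit area, lam, such that exp(-lam * C) equals m."""
    return -math.log(m) / C


# The nine combinations of falling area C and expected uncovered fraction m.
for C in (0.2, 1.0, 1.8):
    for m in (0.25, 0.5, 0.75):
        print(f"C = {C:3.1f}  m = {m:4.2f}  lambda = {poisson_density(m, C):.4f}")
```

For a given m the product λC is the same for every C, which is why only three relations are needed for the nine cases.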

The 2nd moments were determined by quadrature from formula (57), where the falling areas 
were circles, and by direct evaluation of the series (58) for the case of squares on squares. 
The results are given in Table 1. 

11. Experimental investigation

Before the work of Robbins and Neyman & Bronowski was brought to the notice of the 
writer, an attempt was made to obtain experimentally a general formula for the variance of 
the fraction of area not covered. Attention was confined to the case of circles falling on 
squares, the centres of the former being chosen randomly (by means of random numbers) 
within the area T whose boundary is at a distance a outside the sides of the unit square A. 

For each combination of C and k, samples of up to 200 in size were drawn. The various 
combinations were as given in Table 2. 

Table 2. Ranges of k and C covered in experimental determination of variance

   C = Area of circles dropped ÷ Area of square      No. of circles = k

   0.0077                                            5, 10, 15
   0.031                                             5, 10, 15, 20, 30
   0.033                                             5, 20, 40, 80, 120
   0.25                                              1, 2, 4, 6
   1                                                 1, 2, 4, 6



Three methods were used to measure the fraction of the fixed square not covered in each
sample. Method P involved the measurement of the covered area by planimeter. Method
L utilized a photoelectric cell to measure the amount of light passing through a glass plate
on which black paper disks had been stuck. Method C consisted of a simple count of squares
on graduated paper, and generally this was the most convenient to operate. (The two
neighbouring values of C were used to compare methods L and P.)

For each combination of C and k the average fraction not covered, Y, was compared with
the theoretical value

$m = (1 - C/T)^{k}$.    (61)
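
The theoretical value (61) and the dropping experiment can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes that T is the area of the region within a distance a of the unit square, so that T = 1 + 4a + πa² with πa² = C, and it replaces the planimeter and photoelectric measurements by a count of grid points, in the spirit of method C. All function names are hypothetical.

```python
import math
import random


def centre_region_area(C):
    """Area T of the region within distance a of the unit square (assumed shape of T)."""
    a = math.sqrt(C / math.pi)
    return 1.0 + 4.0 * a + math.pi * a * a


def expected_uncovered(C, k):
    """Theoretical mean fraction not covered, m = (1 - C/T)^k."""
    return (1.0 - C / centre_region_area(C)) ** k


def simulate_uncovered(C, k, grid=40, rng=random):
    """Drop k circles of area C with centres uniform on T; return the uncovered fraction."""
    a = math.sqrt(C / math.pi)
    centres = []
    while len(centres) < k:
        # Rejection sampling: keep points whose distance to the unit square is at most a.
        x = rng.uniform(-a, 1.0 + a)
        y = rng.uniform(-a, 1.0 + a)
        dx = max(-x, 0.0, x - 1.0)
        dy = max(-y, 0.0, y - 1.0)
        if dx * dx + dy * dy <= a * a:
            centres.append((x, y))
    uncovered = 0
    for i in range(grid):
        for j in range(grid):
            px, py = (i + 0.5) / grid, (j + 0.5) / grid
            if all((px - cx) ** 2 + (py - cy) ** 2 > a * a for cx, cy in centres):
                uncovered += 1
    return uncovered / (grid * grid)


print(round(expected_uncovered(0.031, 10), 4))
```

Under this assumption about T, the formula gives values close to the expected means legible in Table 3, for example about 0.803 for C = 0.031 and k = 10.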







The observed standard deviation of the observations being s, the appropriate criterion for
testing the mean is

$t = \dfrac{Y - m}{s/\sqrt{P}}$,

P being the number of observations in the sample. The results are given in Table 3.
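
The criterion can be checked against a legible row of Table 3 with a few lines of code. The function name `t_statistic` is hypothetical; the figures used are those read from the row C = 0.031, k = 20.

```python
import math


def t_statistic(observed_mean, expected_mean, sd, n):
    """t = (Y - m) / (s / sqrt(P)) for a sample of P observations."""
    return (observed_mean - expected_mean) / (sd / math.sqrt(n))


# Row C = 0.031, k = 20 of Table 3: Y = 0.6349, m = 0.6449, s = 0.0423, P = 100.
print(round(t_statistic(0.6349, 0.6449, 0.0423, 100), 4))
```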

Table 3. Comparison between observed and expected values of fraction of area not covered

P = planimeter method, L = photoelectric method, C = counting method.

C = Area of circle   No. of    No. in    Mean area not covered   Standard    Deviation             Method of
  ÷ Area of square   circles   sample    Observed    Expected    deviation   t = (Y - m)/(s/√P)    measurement
                     k         P         Y           m           s

0.0077               5         100                   0.9683      0.0055       2.0000*              P
0.0077               10        100       0.9394      0.9376      0.0088       2.0464*              P
0.0077               15        100       0.9105      0.9079      0.0090       2.8889†              P
0.031                5         100                               0.0222      -1.6667               P
0.031                10        200       0.8007      0.8030                  -0.9739               P
0.031                15        100                               0.0404      -1.6347               P
0.031                20        100       0.6349      0.6449      0.0423      -2.3641*              P
0.031                30        100       0.5057                  0.0493      -2.4746*              P
0.033                5         100       0.8868      0.8901      0.0250                            L
0.033                20        100       0.6326                  0.0500                            L
0.033                40        100       0.3969      0.3941      0.0367                            L
0.033                80        75                    0.1553      0.0358                            L and P
0.033                120       50        0.0657      0.0612      0.0256       1.2478               P
0.25                 1         30                    0.8950      0.0824      -0.2260               C
0.25                 2         110       0.7853                  0.1214      -1.3477               C
0.25                 4         110       0.6263      0.6415      0.1481      -1.0764               C
0.25                 6         110       0.4891      0.5138      0.1307      -1.9820*              C
1.0                  1         20        0.7583                  0.2360      -0.1332               C
1.0                  2         50        0.5890                  0.2346       0.1025               C
1.0                  4         50        0.3861                  0.2334       1.3067               C
1.0                  6         50        0.1686      0.2008      0.1630      -1.3968               C

* Between 5 and 1 % levels. † Beyond 1 % level.


An examination of the values of t shows that too many of them are outside the 5 %
significance levels, while in each set corresponding to one value of C the values are too
frequently of the same sign. The worst deviation is for C = 0.033 and k = 5, with t = -5.92,
but this only corresponds to a difference between the observed mean of 0.886 and the
theoretical mean of 0.890. The test is thus very sensitive and the deviations are not serious,
and they arise from imperfections in the technique which have not been investigated in detail.

As regards random errors of measurement as distinct from bias, it was not possible to
carry out a systematic estimation of the contribution of this source to the total variation.
A series of repeated measurements for the case C = 0.031, k = 30, for which the observed
mean was 0.51, showed that the individual measurements had a standard error of about
0.009. As the total standard deviation in this case was 0.049, the true estimate of the standard
error (i.e. omitting the error of measurement) was √(0.049² - 0.009²) = 0.048, and for our
purposes this difference is negligible. There is thus some evidence for assuming that this
method of estimating the variance was satisfactory.
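
The subtraction of the measurement error in quadrature can be reproduced directly. The sketch assumes, as the text implies, that the error of measurement is independent of the sampling variation, so that the variances add; the function name is hypothetical.

```python
import math


def sd_without_measurement_error(total_sd, measurement_sd):
    """Standard deviation left after removing an independent measurement error."""
    return math.sqrt(total_sd ** 2 - measurement_sd ** 2)


# Case C = 0.031, k = 30: total s.d. 0.049, measurement s.e. about 0.009.
print(round(sd_without_measurement_error(0.049, 0.009), 3))
```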

12. Derivation of empirical formula for the variance
The consideration that the theoretical variance σ² of the fraction uncovered must be
small whenever the mean m of this fraction is near the limits of its range, zero or unity,
suggests that we might try the relation

$\sigma^2 \propto m(1 - m)$













Table 4. Derivation of empirical formula

Columns: area of circle ÷ area of square = C; no. of circles k; ratio T/C; observed variance s²;
s²/m(1 - m); mean g(C); (T/C)^{3/2} g(C); empirical value of variance σ₁² = 2.3m(1 - m)/(T/C)^{3/2};
exact value of variance σ²; percentage error in s²; percentage error in σ₁².

   C          Mean g(C)     (T/C)^{3/2} g(C)

   0.0077     0.00108       2.10
   0.031      0.00759       2.38
   0.033      0.00876       2.51
   0.25       0.0820        2.41
   1.0        0.234         2.05
























for given C. Accordingly we have calculated the quantity

$g = \dfrac{s^2}{m(1 - m)}$,    (62)

and the results are given in Table 4.

For each value of C the values of g are by no means constant, but the variation is not
excessive, and it is considered that for practical purposes we can take g to be a function
of C, at least over the range considered. To obtain a suitable form for this function, it was
decided, for very general reasons, to seek a simple relation between g(C) and T/C, the latter
being roughly the number of C's which could be placed on T if they could be fitted together
without overlapping.


Table 5. Comparison of empirical formula for variance with exact value

σ² = exact value, σ₁² = empirical value, E = percentage error = 100(σ₁² - σ²)/σ².

   C             k = 1      k = 2      k = 3      k = 4      k = 5      k = 6

   0.2    σ²     0.00501    0.00873    0.0114     0.0132     0.0144     0.0150
          σ₁²    0.00516    0.00896    0.0117     0.0135     0.0147     0.0154
          E      +3.0       +2.6       +2.6       +2.3       +2.1       +2.7

   0.4    σ²     0.0156     0.0246     0.0290     0.0305     0.0300     0.0274
          σ₁²    0.0149     0.0237     0.0284     0.0304     0.0305     0.0294
          E      -4.5       -3.7       -2.1       -0.3       +1.7       +7.3

   0.6    σ²     0.0277     0.0403     0.0447     0.0427     0.0395     0.0339
          σ₁²    0.0257     0.0384     0.0431     0.0433     0.0407     0.0370
          E      -7.2       -4.7       -3.6       +1.4       +3.0       +9.1

   0.8    σ²     0.0398     0.0541     0.0552     0.0501     0.0428     0.0351
          σ₁²    0.0365     0.0517     0.0551     0.0525     0.0471     0.0407
          E      -8.3       -4.4       -0.2       +4.8       +10.0      +16.0

   1.0    σ²     0.0508     0.0651     0.0628     0.0540     0.0436     0.0339
          σ₁²    0.0471     0.0636     0.0648     0.0590     0.0507     0.0420
          E      -7.3       -2.3       +3.2       +9.3       +16.3      +23.9

   1.2    σ²     0.0605     0.0737     0.0677     0.0554     0.0427     0.0317
          σ₁²    0.0571     0.0740     0.0724     0.0635     0.0525     0.0419
          E      -5.6       +0.4       +6.9       +14.6      +22.9      +32.5

   1.4    σ²     0.0689     0.0804     0.0706     0.0554     0.0410     0.0292
          σ₁²    0.0667     0.0832     0.0786     0.0665     0.0531     0.0411
          E      -3.2       +3.5       +11.3      +20.0      +29.5      +40.7

   1.6    σ²     0.0763     0.0855     0.0723     0.0546     0.0389     0.0267
          σ₁²    0.0757     0.0913     0.0834     0.0684     0.0530     0.0398
          E      -0.8       +6.8       +15.4      +25.3      +36.3      +49.1

   1.8    σ²     0.0828     0.0895     0.0730     0.0531     0.0367     0.0244
          σ₁²    0.0843     0.0985     0.0873     0.0695     0.0524     0.0383
          E      +1.8       +10.1      +19.6      +30.9      +42.8      +57.0


Fig. 2 shows the result of plotting the observed mean value of g(C) against T/C on
logarithmic scales. The points (the values of s²/m(1 - m) for various values of k are plotted in
addition to the mean) lie reasonably close to a straight line of slope -1.5, indicating the
relation

$g(C) \propto (C/T)^{3/2}$.    (63)
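
The slope of the logarithmic plot can be re-estimated from the five mean values of g(C) that remain legible in Table 4. The sketch below rests on two assumptions that should be checked against the printed table: that these five values belong to C = 0.0077, 0.031, 0.033, 0.25 and 1.0 in that order, and that T = 1 + 4a + πa², the area within a distance a of the unit square. The helper names are hypothetical.

```python
import math


def centre_region_area(C):
    """Assumed area T: region within distance a of the unit square, where pi * a^2 = C."""
    a = math.sqrt(C / math.pi)
    return 1.0 + 4.0 * a + C


def loglog_slope(points):
    """Least-squares slope of log(y) on log(x)."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    n = len(points)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Mean g(C) as read from Table 4 (the pairing with C is an assumption).
mean_g = {0.0077: 0.00108, 0.031: 0.00759, 0.033: 0.00876, 0.25: 0.0820, 1.0: 0.234}
points = [(centre_region_area(C) / C, g) for C, g in mean_g.items()]
slope = loglog_slope(points)
print(round(slope, 2))
```

With these inputs the fitted slope comes out close to -1.5, in line with the straight line described for Fig. 2.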

The values of (T/C)^{3/2} g(C) are shown in Table 4, where the values are seen to lie between 2
and 2.5 with an average of 2.3. Thus we derive the rough empirical formula

$\sigma_1^2 = \dfrac{2.3\,m(1 - m)}{(T/C)^{3/2}}$,    (64)

where $m = (1 - C/T)^{k}$.
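
The empirical formula (64) is easy to evaluate numerically. The following sketch again assumes T = 1 + 4a + πa² for circles of radius a falling on a unit square; the function names are hypothetical.

```python
import math


def centre_region_area(C):
    """Assumed area T: region within distance a of the unit square, where pi * a^2 = C."""
    a = math.sqrt(C / math.pi)
    return 1.0 + 4.0 * a + C


def empirical_variance(C, k, constant=2.3):
    """sigma_1^2 = constant * m * (1 - m) / (T/C)^(3/2), with m = (1 - C/T)^k."""
    T = centre_region_area(C)
    m = (1.0 - C / T) ** k
    return constant * m * (1.0 - m) / (T / C) ** 1.5


for C, k in ((0.2, 2), (1.4, 3), (1.8, 6)):
    print(C, k, round(empirical_variance(C, k), 5))
```

Under the stated assumption the three values printed agree, to the figures given, with the σ₁² entries of Table 5 for the same C and k.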







These are given in Table 4, together with the values in some cases of the percentage error of
σ₁² compared with the true value σ² obtained from the methods described in §5; the
percentage error in the estimate s² observed experimentally is also given. (It was not possible to
evaluate σ² in all cases, as the computation is somewhat laborious.) Another set of
comparisons between the empirical and the exact formula is given in Table 5, over the
range C = πa² from 0.2 to 1.8 and k = 1, 2, 3, 4, 5, 6.

It will be seen that the empirical formula gives quite a satisfactory fit, e.g. with an error 
less than 10 %, over a considerable part of the range studied, but that the error tends to 
increase, i.e. the formula exaggerates the variance, as C and k increase. 


13. Use of the empirical formula for the Poisson case 


If k circles are dropped with their centres falling at random on the area T the mean area not
covered can be written as

$m(k) = (1 - C/T)^{k}$,

and the empirical formula for the variance as

$\sigma_1^2(k) = \dfrac{2.3\,m(k)\,[1 - m(k)]}{(T/C)^{3/2}}$.


Table 6. Comparison of empirical formula for variance of area not covered
in case of circles falling on square according to Poisson distribution

   Size of falling                  Mean area not covered, m
   area C ÷ fixed area              0.25        0.5         0.75

   0.2           σ²                 0.0186      0.0303      0.0254
                 σ₁²                0.0196      0.0308      0.0257
                 % error            5.4         1.7         1.2

   1.0           σ²                 0.0608      0.0964      0.0789
                 σ₁²                0.0669      0.0981      0.0781
                 % error            10.0        1.8         -1.0

   1.8           σ²                 0.0816      0.1262      0.1026
                 σ₁²                0.0942      0.1348      0.1057
                 % error            15.4        6.8         3.0


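Hence if k follows the Poisson distribution with expectation λT, the total variance based
on the empirical formula is

$\sigma_1^2 = \sum_{k=0}^{\infty} \dfrac{e^{-\lambda T}(\lambda T)^{k}}{k!}\,\{\sigma_1^2(k) + m^2(k)\} - m^2$,

where m = expected area not covered

$= e^{-\lambda C}$.    (65)

We find after expanding m²(k) that

$\sigma_1^2 = m^2\left\{\left[1 - \dfrac{2.3}{(T/C)^{3/2}}\right] m^{-C/T} - 1\right\} + \dfrac{2.3\,m}{(T/C)^{3/2}}$.    (66)

This formula is compared with the exact values in Table 6 over the same range as in Table 1.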
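
The Poisson form of the empirical formula can be checked in two ways: by the closed expression and by summing the fixed-k formula over the Poisson weights. The sketch below is illustrative only. It assumes T = 1 + 4a + πa² and reads the closed form as m²{[1 - c]m^(-C/T) - 1} + cm with c = 2.3/(T/C)^{3/2}; function names are hypothetical.

```python
import math


def centre_region_area(C):
    """Assumed area T: region within distance a of the unit square, where pi * a^2 = C."""
    a = math.sqrt(C / math.pi)
    return 1.0 + 4.0 * a + C


def poisson_variance_closed(C, m, constant=2.3):
    """Closed form of the empirical variance when k is Poisson and exp(-lam * C) = m."""
    T = centre_region_area(C)
    c = constant / (T / C) ** 1.5
    return m * m * ((1.0 - c) * m ** (-C / T) - 1.0) + c * m


def poisson_variance_series(C, m, constant=2.3, terms=400):
    """Same quantity by direct summation over the Poisson distribution of k."""
    T = centre_region_area(C)
    c = constant / (T / C) ** 1.5
    q = 1.0 - C / T
    mean_k = -math.log(m) * T / C      # expectation lam * T of the number of circles
    weight = math.exp(-mean_k)         # Poisson probability of k = 0
    total = 0.0
    for k in range(terms):
        m_k = q ** k
        total += weight * (c * m_k * (1.0 - m_k) + m_k * m_k)
        weight *= mean_k / (k + 1)
    return total - m * m


for C in (0.2, 1.0, 1.8):
    print(C, [round(poisson_variance_closed(C, m), 4) for m in (0.25, 0.5, 0.75)])
```

The two routes agree to rounding error, and under the stated assumption the closed form gives values matching the σ₁² rows of Table 6.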






The agreement is again reasonably satisfactory over the greater part of the range, large
positive errors occurring for large values of C and small values of m. These errors might be
reduced by using a constant rather smaller than 2.3 in the empirical formula, but the point
has not been investigated further.

SUMMARY

The mathematical study of bombing has given rise to the following problem. A fixed 
outline, such as a square or circle, is drawn on a plane, and other similar outlines are dropped 
at random on it. Estimates are then required of the variance of the fixed area which is not 
covered. Work by Robbins enables a theoretical formula to be derived, and Bronowski 
& Neyman have treated, by an independent method, the special case of rectangles falling 
on rectangles. 

It is shown that in the case of circles falling on circles, squares or rectangles, the variance
can be expressed as the integral with respect to r of the product of two functions, one being
a simple function of the area of overlap of two circles with centres r apart, and the other
being the frequency function of the distance r between two points chosen at random in the
'covered' area. This applies both to the case where the number of falling areas is fixed and
where it follows a Poisson distribution. Numerical values have been calculated for a
number of cases. An experimental method had been carried out prior to the above
theoretical work, and the following empirical formula was derived for the variance of the
fraction of a fixed square not covered by k circles, area C, falling at random on an area T
containing centres of all C's which cover or touch the fixed square:

$\sigma_1^2 = \dfrac{2.3\,m(1 - m)}{(T/C)^{3/2}}$,

where m = mean fraction of area not covered

$= (1 - C/T)^{k}$.

This formula, and its extension to the Poisson case, have been shown to be in reasonable 
agreement with the exact values over a considerable range. 

The writer is indebted to Miss G. O. Jeffcoate for valuable assistance in the computing
and experimental work.


REFERENCES

Bronowski, J. & Neyman, J. (1945). Ann. Math. Statist. 16, 330.

Neyman, J. (1939). Ann. Math. Statist. 10, 35.

Robbins, H. E. (1944). Ann. Math. Statist. 15, 70.

Robbins, H. E. (1945). Ann. Math. Statist. 16, 342.

Whitworth, W. A. (1897). DCC Exercises in Choice and Chance. Cambridge.




A STUDY OF A FIRST DYNASTY SERIES OF EGYPTIAN SKULLS
FROM SAKKARA AND OF AN ELEVENTH DYNASTY SERIES
FROM THEBES

By A. BATRAWI, Ph.D. and G. M. MORANT, D.Sc.

1. Introduction. This paper deals with forty-four male crania of 1st dynasty date
(c. 3400 B.C.) discovered at Sakkara by Macramallah Effendi, who has published a report
on the excavations (1940), and with fifty-five crania of 11th dynasty (c. 2000 B.C.) soldiers
unearthed at Thebes in 1927 by the Egyptian Expedition of the Metropolitan Museum of
Art, New York (Winlock, 1928). The cemetery at Sakkara, 20 miles south of Gizeh, was used
by the middle classes of the local community. Prof. D. E. Derry has kindly provided the
following notes on it:

The 1st dynasty cemetery at Sakkara excavated by Macramallah Effendi is of special interest.
Comparatively few cemeteries of this date have been found, and, while the total number of forty-four
skulls from which reliable measurements could be taken was small, yet the results yielded by these
are such as to show that we are dealing with a race which differs in important features from those
exhibited by the so-called predynastic people.

The observation that there were two races in Egypt in the early dynastic period was first made in
the year 1909, when the results of measurements obtained from a series of male and female skulls of
the 4th and 5th dynasties from the great necropolis surrounding the pyramids of Giza came to be
examined and compared with crania from early predynastic graves. Until then the theory of an
unbroken evolution of the Egyptian race from prehistoric times right through the dynastic period had
been taught. It now became obvious that the culture which we know of as peculiarly Egyptian was
associated with a race which could not have been derived from the predynastic people. The introduction
of stone-working resulting in the erection of great tombs and statuary, as well as beautifully executed
reliefs, paintings and above all writing, all pointed to a race far in advance of the predynastic people,
who although skilled in the making of bowls and vases in stone as well as in pottery, and who had
already attained to the discovery of the uses of copper, were, nevertheless, little removed from the
Neolithic period.

The cemetery is unusual in consisting entirely of males. In the note on the skulls published in
Macramallah Effendi's report it is stated that there were some females included in the collection. After
the report had gone to press Macramallah Effendi informed the writer that a part of the cemetery
was of 18th dynasty date. It turned out that all the female skulls came from this part and that therefore
the 1st dynasty cemetery contained only remains of males. Dr Batrawi's examination of the figures
confirms the statement made at the beginning of this note and shows the closeness of the relationship
of the people of the 1st dynasty at Sakkara with those of the 4th and 5th dynasties from Deshasheh
and Medum.

In his report on the discovery of the series of 11th dynasty skeletons at Thebes Mr H. E.
Winlock (1928) says that they were found in 'a tomb in the row where the grandees of
Mentuhotep's court had been buried'. He remarks:

Obviously what we had found was a soldiers' tomb. To judge from the cheapness of their burial they
were only soldiers of the rank and file, and yet they had been given a catacomb presumably prepared
for the dependents of the royal household, next to the tomb of the chancellor Khety. Clearly that was
an especial honour. If we are right in supposing that all had been buried at once, they must have been
slain in a single battle.

Prof. Derry examined the bodies on the spot, and he took measurements of the crania 
and of some of the long bones. About sixty bodies were counted and all proved to be those 




of adult males who had died in the prime of life. Prof. Derry says that the skeletons were 
reburied after they had been examined. 

2. The measurements of the crania. The Sakkara series was sexed and measured by
Prof. Derry and we are indebted to him for allowing us to use his records in this paper.
All the absolute measurements, given in Appendix II, are his readings with the exception
of those of the foramen magnum, which were taken by one of the writers (A.B.) of this
report. The measurements of the Thebes series were also kindly provided by Prof. Derry,
together with means he had calculated. The readings for individual crania are given in
Appendix III.

The technique of measurement followed by Prof. Derry is that of the Monaco Congress
(Duckworth, 1913). He had used this when measuring the predynastic Egyptian series of
skulls from Badari, of which part was remeasured later in London by Miss B. N. Stoessiger
(1927), who followed the biometric technique. The two sets of measurements of the same
fifty-three specimens have been compared (Morant, 1935), thus showing in detail what
relations are to be expected between readings obtained by following the two techniques.
These results were taken into account in preparing the definitions of Prof. Derry's
measurements given in Appendix I below. The characters are denoted as far as possible there and
in the tables by the customary index letters of the biometric technique.

3. The nature of the two series. Mean measurements and standard deviations for the two
series are given in Table 1. The longest series of Egyptian skulls measured, known as the
E series, came from a cemetery at Giza used from the 26th-30th dynasties (Davin &
Pearson, 1924). Judging from comparisons of constants for a number of cranial characters,
most of the other ancient Egyptian series described exhibit almost precisely the same order
of variation as the one from Giza. In general they have been found to be rather less variable
than European cranial series, while there is no evidence that there was any appreciable
change in the variation exhibited by Egyptian populations during the long period from
early predynastic to Roman times.

The two new series are shorter than several from Egypt previously described. Counting
the number of characters for which the standard deviation for one series is greater or less
than the corresponding constant for the other series, the situation is:

Sakkara and Thebes: Sakkara s.d. greater for nine and less for ten characters;

Sakkara and Giza: Sakkara s.d. greater for four and less for eleven characters;

Thebes and Giza: Thebes s.d. greater for eight and less for nine characters.

This crude comparison suggests that there can have been no marked differences between
the variabilities of the three populations represented. As sets of differences are considered,
the limit of significance accepted may be taken considerably higher than in the case of
a single difference. Suppose that there is a real distinction if two of the standard deviations
differ by an amount which is 3.5 or more times its probable error. Then one significant
difference is found for the Sakkara and Thebes series (NH, L: Sakkara s.d. greater,
d/p.e. d = 3.8), none for the Sakkara and Giza, and three for the Thebes and Giza series
(H' 4.1, El 3.5, jS'a 3.5, Giza s.d. the greater in all three cases). The two new series are too
short to give reliable comparisons, but the evidence suggests that the populations they
represent were equally homogeneous, while both were rather less mixed in racial
composition than the 26th-30th dynasty population of Giza.




4. Comparisons of mean measurements. Following biometric practice, it may be supposed
in such a case that no statistical analysis of the series can reveal its racial components.
The relationships of the series have to be judged by comparing them as wholes, on the basis
of mean measurements, with other series known to exhibit unexceptional variation. It
was shown by Morant (1925) that the recorded series of ancient Egyptian skulls can be
divided into two groups. These were called, for convenience, the Upper and Lower Egyptian,

Table 1. Means and standard deviations (with probable errors) of the Sakkara 1st dynasty
and Thebes 11th dynasty series of male skulls

                                    Means                              Standard deviations
Character*           Sakkara               Thebes                Sakkara          Thebes

L                    186.9 ± 0.56 (41)     181.8 ± 0.53 (54)     5.31 ± 0.40      5.75 ± 0.37
B                    138.7 ± 0.41 (43)     138.3 ± 0.41 (54)     3.99 ± 0.29      4.52 ± 0.29
B'                    96.5 ± 0.39 (36)      93.6 ± 0.43 (55)     3.48 ± 0.28      4.72 ± 0.30
H'                   135.4 ± 0.67 (32)     137.1 ± 0.37 (51)     5.63 ± 0.47      3.91 ± 0.26
[Aur. ht.]           114.8 ± 0.67 (27)     -                     5.20 ± 0.48      -
LB                   102.7 ± 0.57 (29)     100.7 ± 0.34 (46)     4.57 ± 0.40      3.40 ± 0.24
U                    518.8 ± 1.5 (29)      507.4 ± 1.3 (50)      12.0 ± 1.1       13.2 ± 0.89
S₁                   -                     125.7 ± 0.47 (52)     -                5.01 ± 0.33
S₂                   -                     129.4 ± 0.56 (53)     -                6.00 ± 0.39
S₃                   -                     115.2 ± 0.82 (51)     -                8.63 ± 0.58
S                    -                     370.5 ± 1.2 (50)      -                12.7 ± 0.86
[Broca's Q']         -                     300.3 ± 0.84 (49)     -                8.72 ± 0.59
fml                   36.7 ± 0.26 (31)     -                     2.18 ± 0.19      -
fmb                   30.4 ± 0.22 (29)     -                     1.74 ± 0.15      -
[G'H]                 71.9 ± 0.55 (30)      72.0 ± 0.37 (45)     4.43 ± 0.39      3.71 ± 0.26
GB                    96.5 ± 0.62 (25)      96.5 ± 0.48 (38)     4.57 ± 0.44      4.40 ± 0.34
J                    127.8 (14)            127.6 ± 0.52 (32)     -                4.33 ± 0.36
[NH, L]               51.2 ± 0.50 (29)      51.8 ± 0.25 (45)     4.00 ± 0.35      2.52 ± 0.18
NB                    25.4 ± 0.21 (30)      25.0 ± 0.20 (42)     1.70 ± 0.15      1.92 ± 0.14
[O₁']                 38.9 ± 0.24 (26)      39.1 ± 0.16 (44)     1.84 ± 0.17      1.55 ± 0.11
[O₂]                  32.5 ± 0.20 (26)      33.1 ± 0.23 (44)     1.50 ± 0.14      2.23 ± 0.16
[Prosthion GL]        99.0 ± 0.56 (26)      96.5 ± 0.47 (43)     4.23 ± 0.40      4.61 ± 0.34
100 B/L               74.2 ± 0.26 (39)      76.1 ± 0.26 (54)     2.44 ± 0.19      2.84 ± 0.18
100 H'/L              72.8 ± 0.41 (30)      75.6 ± 0.29 (51)     3.37 ± 0.29      3.06 ± 0.20
100 B/H'             102.6 ± 0.60 (31)     100.8 ± 0.41 (51)     4.95 ± 0.42      4.34 ± 0.29
100 fmb/fml           83.3 ± 0.72 (29)     -                     5.76 ± 0.51      -
[100 G'H/GB]          74.3 ± 0.50 (25)      76.8 ± 0.53 (38)     3.70 ± 0.35      4.83 ± 0.37
[100 NB/NH, L]        49.5 ± 0.58 (29)      48.3 ± 0.48 (42)     4.63 ± 0.41      4.61 ± 0.34
[100 O₂/O₁']          83.6 ± 0.51 (26)      84.6 ± 0.49 (44)     3.86 ± 0.36      4.81 ± 0.35

* The characters are defined in Appendix I. A symbol in square brackets denotes either that this measurement
is one not usually included in the biometric technique, or else that Prof. Derry's method of taking the
measurements does not accord with biometric practice.


though there is evidence that the regions represented changed somewhat with time. The
series in the first group came from the neighbourhood of Thebes and sites farther south,
while those in the second group came from the same region of Upper Egypt and sites
farther north. The first group includes all the predynastic series that have been described
and some of dynastic date, the latest being of the 18th dynasty: the second group ranges
from the 1st dynasty to Roman times, though no series available earlier than the 4th
dynasty had come from the region immediately south of the Delta. The Sakkara series
described in the present paper extends the range of such material back to the 1st dynasty.








It had been found that the means for all these series are almost constant for most of the
metrical characters commonly recorded, but for a few measurements more significant
differences are found and these separate the two groups of series. Characters of both kinds
are treated in Table 2, which is based on Table XIII in Risdon's paper (1939) on the human
remains from Lachish (Palestine). The first six characters are those which make the clearest
distinction between the Upper and Lower Egyptian types of series, and they are all breadths
or dependent on breadths (the latter being the horizontal circumference and the two
indices) of the cranium. The Sakkara series is clearly assigned to the Lower Egyptian
group, and if counted as a member of this the range of the mean minimum frontal breadths
(B') for the group is slightly extended. The Thebes series is also assigned to the Lower
Egyptian group by four of the six characters in question: for U and 100 B/H', however,
its means fall within the ranges given for the Upper Egyptian type of series.


Table 2. Ranges of mean measurements for two groups of series of ancient Egyptian
male crania and means for the Sakkara and Thebes series*

Series                 Period                      B                   J                  B'               U

Upper Egyptian type    Early predyn.-18th dyn.     131.4-134.3 (10)    123.6-127.5 (8)    90.4-92.8 (4)    500.0-510.4 (4)
Sakkara                1st dyn.                    138.7               127.8              96.5             518.8
Thebes                 11th dyn.                   138.3               127.6              93.6             507.4
Lower Egyptian type    1st dyn.-Roman              135.3-139.3 (9)     127.5-131.3 (8)    93.0-96.2 (5)    510.8-518.7 (6)

Series                 Period                      100 B/L             100 B/H'            L                   H'

Upper Egyptian type    Early predyn.-18th dyn.     71.7-73.7 (10)      98.1-101.1 (10)     182.2-186.2 (10)    132.4-135.9 (10)
Sakkara                1st dyn.                    74.2                102.6               186.9               135.4
Thebes                 11th dyn.                   76.1                100.8               181.8               137.1
Lower Egyptian type    1st dyn.-Roman              73.7-76.0 (9)       102.3-108.4 (9)     181.4-185.8 (9)     130.7-136.0 (9)

* The characters are defined in Appendix I. The numbers in brackets give the numbers of series to which the ranges
relate. In the case of these previously described series the smallest number of crania on which any one of the means is
based is 16, though this minimum number is about 30 for most of the characters. The numbers on which the Sakkara
and Thebes means are based can be seen from Table 1, the only one less than 25 being 14 for the bizygomatic breadth
(J) of the Sakkara series.


The last two characters in Table 2, which are the length and height of the cranium, fail
to distinguish the two contrasted groups of series. The means for the two new series fall
outside the ranges previously given by all the ancient Egyptian material, the Sakkara
series giving the greatest L and the Thebes the greatest H'. The evidence of other characters
must be taken into account, but so far the comparisons suggest that the two new series
are of the Lower Egyptian type, and it is to be expected that they bear a closer resemblance
to some of the series assigned to that group than to any other cranial series.

At the same time it may be noted that the Sakkara 1st dynasty and Thebes 11th dynasty
populations are clearly differentiated by their mean cranial measurements. There are twenty
characters in Table 1 for which means for both series are available. The most significant
difference is for L, and it is 6.6 times its probable error, while five other characters (B', U,
prosthion GL, 100 B/L and 100 H'/L) also show differences which exceed four times their
probable errors.
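
The comparison of two means in units of the probable error can be sketched as follows. The sketch assumes the usual conventions of the period, which the text does not spell out: the probable error of a mean is 0.6745 s.d./√n, and the probable error of a difference between independent means is the root of the sum of the squared probable errors. Function names are hypothetical and the figures are those read from Table 1 for L.

```python
import math


def probable_error_of_mean(sd, n):
    """Probable error of a mean, taken as 0.6745 * sd / sqrt(n)."""
    return 0.6745 * sd / math.sqrt(n)


def difference_in_probable_errors(mean1, pe1, mean2, pe2):
    """Absolute difference of two means divided by the probable error of the difference."""
    return abs(mean1 - mean2) / math.hypot(pe1, pe2)


# Cranial length L: Sakkara 186.9 ± 0.56 (41), Thebes 181.8 ± 0.53 (54).
print(round(probable_error_of_mean(5.31, 41), 2))
print(round(difference_in_probable_errors(186.9, 0.56, 181.8, 0.53), 1))
```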

5. Comparisons by coefficients of racial likeness. The method of Karl Pearson's coefficient
of racial likeness has been applied extensively to series of ancient Egyptian crania. Risdon
(1939) has given comparisons made in that way for twenty-two male series, including three
from sites outside Egypt, and the treatment below is almost restricted to comparisons
between these and the two new series described in the present paper. The procedure
followed in applying the method described in several papers in Biometrika was adopted without
modification.*
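
The coefficient defined in the footnote can be sketched in code. This is an illustration under stated assumptions, not the authors' computing procedure: the crude coefficient is taken as the mean of α = [n n'/(n + n')](M - M')²/σ² over the m characters, less one, with probable error 0.6745√(2/m), and the reduced coefficient is taken as the crude one multiplied by 50(n̄ + n̄')/(n̄ n̄'), which is how the garbled footnote formula is read here. The data layout and function names are hypothetical.

```python
import math


def crude_coefficient(series1, series2, sigmas):
    """Crude coefficient and its probable error.

    series1, series2: dict mapping character -> (mean, number of crania).
    sigmas: dict mapping character -> standard deviation of the reference series.
    """
    characters = [c for c in series1 if c in series2 and c in sigmas]
    m = len(characters)
    total = 0.0
    for c in characters:
        mean1, n1 = series1[c]
        mean2, n2 = series2[c]
        total += (n1 * n2) / (n1 + n2) * ((mean1 - mean2) / sigmas[c]) ** 2
    return total / m - 1.0, 0.6745 * math.sqrt(2.0 / m)


def reduced_coefficient(series1, series2, sigmas):
    """Crude coefficient rescaled as if both series had 100 crania (assumed factor)."""
    characters = [c for c in series1 if c in series2 and c in sigmas]
    nbar1 = sum(series1[c][1] for c in characters) / len(characters)
    nbar2 = sum(series2[c][1] for c in characters) / len(characters)
    factor = 50.0 * (nbar1 + nbar2) / (nbar1 * nbar2)
    crude, pe = crude_coefficient(series1, series2, sigmas)
    return crude * factor, pe * factor


# Invented figures for illustration only (not data from the paper).
first = {"L": (180.0, 100), "B": (140.0, 100)}
second = {"L": (185.0, 100), "B": (136.0, 100)}
reference_sd = {"L": 5.0, "B": 4.0}
print(crude_coefficient(first, second, reference_sd))
print(reduced_coefficient(first, second, reference_sd))
```

When both series have exactly 100 crania for every character the scaling factor is unity, so the crude and reduced coefficients coincide.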

In deriving a classification of a number of cranial, or living, series from the coefficients
of racial likeness found between them, it has been shown repeatedly that the most
suggestive arrangement is obtained if the closest resemblances of the series, indicated by
coefficients below a certain value, are alone taken into account. Risdon has given a diagram
(1939, Fig. 3) showing all the reduced coefficients less than 5.0 between the twenty-two
series with which he dealt. There are fifty-three of this lowest order among the 231
( = 22 × 21/2) comparisons. The addition of the two new series to the classification referred
to only requires a knowledge of the reduced coefficients less than 5.0 between them and
the twenty-two series.

It has been pointed out that inspection of a few mean measurements can indicate whether
a comparison of two particular series would almost certainly give a reduced coefficient
greater than the limit chosen, or whether it might provide a value less than 5.0. The
measurements used for this rapid test are six which are known to be those which show the
most significant differences, and the greatest proportions of such differences, in comparisons
of the group of series. These are the length, breadth and height of the brain-box and the
three indices derived from these chords. For the fifty-three comparisons of the twenty-two


* A ‘crude’ coefficient is defined by

$$\frac{1}{m}\,S\left\{\frac{n_s n_{s'}}{n_s+n_{s'}}\left(\frac{M_s-M_{s'}}{\sigma_s}\right)^2\right\} - 1 \pm 0.6745\sqrt{\frac{2}{m}},$$

where $M_s$ is a mean based on $n_s$ crania for the first series, $M_{s'}$ and $n_{s'}$ are the corresponding constants for the
second series and $m$ characters are compared. The $\sigma$'s of the long 26th-30th dynasty Egyptian series were
used throughout. The crude coefficient may be written

$$\frac{1}{m}\,S(\alpha) - 1 \pm 0.6745\sqrt{\frac{2}{m}},$$

where

$$\alpha = \frac{n_s n_{s'}}{n_s+n_{s'}}\left(\frac{M_s-M_{s'}}{\sigma_s}\right)^2.$$

Its value is largely determined by the sizes of the two samples that happen to be available, if in fact they do not
represent the same population. As many excavated crania are damaged to some extent, in the case of a particular
series means for different characters will usually be based on various numbers of specimens (see Table 1). The
mean number available for the characters used is denoted by $\bar n_s$ in the case of the first series and by $\bar n_{s'}$ in the
case of the second series, and these ‘sizes’ of the samples are usually unequal and may be of very different orders.
To obtain, as far as possible, a measure of the absolute divergence of the types compared which does not depend
on the numbers of crania available, a ‘reduced’ coefficient of racial likeness is computed. This is defined to be

$$\frac{100\times 100}{100+100}\times\frac{\bar n_s+\bar n_{s'}}{\bar n_s\,\bar n_{s'}}\left\{\frac{1}{m}\,S(\alpha) - 1 \pm 0.6745\sqrt{\frac{2}{m}}\right\}.$$

A reduced coefficient may be supposed a good approximation to the value which would be obtained if all the
means for both series were for 100 individuals instead of for the numbers actually available. If a crude
coefficient differs from zero by less than 3.5 times its probable error (a rare occurrence) then it is supposed
that there is no evidence of a significant distinction between the two populations represented. In this case
there is no need to compute a reduced coefficient. Otherwise, reduced coefficients are found and the classification
of a number of series is based on these.
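
The arithmetic of the footnote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the probable error is taken as 0.6745 times the square root of 2/m, and the reduction multiplies both the crude coefficient and its probable error by 50 times (n̄_s + n̄_s')/(n̄_s n̄_s'). The example figures (crude 2.01 ± 0.30, mean numbers 49.0 and 50.4) are those printed in Table 3 for Thebes with modern Cretans.

```python
import math


def crude_crl(means_a, counts_a, means_b, counts_b, sigmas):
    """Crude coefficient of racial likeness and its probable error.

    Each argument is a list with one entry per character compared.
    `sigmas` are the standard deviations of the reference series.
    """
    m = len(sigmas)
    total = 0.0
    for ma, na, mb, nb, sd in zip(means_a, counts_a, means_b, counts_b, sigmas):
        # alpha for one character: harmonic-type weight times squared
        # standardised difference of the two means
        total += (na * nb) / (na + nb) * ((ma - mb) / sd) ** 2
    crude = total / m - 1.0
    probable_error = 0.6745 * math.sqrt(2.0 / m)
    return crude, probable_error


def reduced_crl(crude, probable_error, nbar_a, nbar_b, standard_n=100.0):
    """Rescale a crude coefficient to samples of `standard_n` crania each."""
    factor = (standard_n * standard_n) / (standard_n + standard_n)
    factor *= (nbar_a + nbar_b) / (nbar_a * nbar_b)
    return factor * crude, factor * probable_error


# Thebes, 11th dynasty (49.0) with modern Cretans (50.4), values from Table 3.
reduced, reduced_pe = reduced_crl(2.01, 0.30, 49.0, 50.4)
print(round(reduced, 2), round(reduced_pe, 2))
```

The result agrees with the tabulated reduced coefficient to within the rounding of the crude value, which suggests the reconstruction of the reduction factor is at least numerically consistent with the table.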




series giving reduced coefficients less than 5.0, the maximum differences for the six
characters (in mm. or units of the indices) are:

    L      B      H'     100 B/L    100 H'/L    100 B/H'
    3.1    3.0    3.5    2.0        2.3         2.8

To avoid the danger of missing comparisons which might be of the order required, in
applying the test each of these values was increased arbitrarily by 0.2, giving:

    L      B      H'     100 B/L    100 H'/L    100 B/H'
    3.3    3.2    3.7    2.2        2.5         3.0

In comparing a new series with the twenty-two it may be supposed that a reduced
coefficient of racial likeness greater than 5.0 would almost certainly be found if the
difference between the means is greater than the accepted limit in the case of any one or more
of the six characters. For such comparisons the coefficients were not calculated. If the
differences between the means are less than the limits for all six characters then a reduced
coefficient less than 5.0 might be found: the coefficients were calculated in all such cases.
In this way detailed comparisons were judged to be required between the new Sakkara,
1st dynasty, series, on the one hand, and six of the twenty-two treated by Risdon on the
other; and between the new Thebes, 11th dynasty, on the one hand, and only two of the
twenty-two series on the other. The previously described series involved in these two sets
of comparisons, one series being included in both sets, are:

(i) Deshasheh and Medum, 4th and 5th dynasties (Thomson & MacIver, 1905). The two
towns are south of Sakkara and both less than 40 miles from it.

(ii) Gizeh, 26th-30th dynasties (Davin & Pearson, 1924).

(iii) Sedment, 9th dynasty (Woo, 1930).

These three and the new Sakkara series are all from Lower Egypt among the total
twenty-four series referred to above. All the other Egyptian sites mentioned are in Middle
Egypt and close to Abydos and Thebes.

(iv) Abydos, 18th dynasty (Thomson & MacIver, 1905).

(v) Abydos, 1st dynasty, royal tombs (Morant, 1925).

(vi) Lachish, Palestine (Risdon, 1939). This series represents an Egyptian population.
It is assigned to the seventh and eighth centuries B.C., though it is not well dated.

(vii) Tigre district, Abyssinia, modern (Sergi, 1912, means given in Morant, 1925).

(viii) Cretans, modern (von Luschan, 1913, means given in Woo, 1930). This series is
not one of the twenty-two dealt with by Risdon. It was included because of its close
resemblance to the new 11th dynasty series from Thebes. The test based on a comparison
of the means of the six calvarial measurements shows that the only ancient Egyptian series
which might give reduced coefficients less than 5.0 with the Cretan series are the Theban
11th dynasty and the Sedment series ((iii) above).
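
The rapid test on six mean measurements described above amounts to a simple screening rule, sketched here in Python. The limits are the increased values quoted in the text; the function name and the mean values in the example are hypothetical, and the text does not say how a difference exactly equal to a limit was treated (it is excluded here).

```python
# Increased limits quoted in the text (mm. or units of the indices).
LIMITS = {
    "L": 3.3,
    "B": 3.2,
    "H'": 3.7,
    "100 B/L": 2.2,
    "100 H'/L": 2.5,
    "100 B/H'": 3.0,
}


def needs_full_comparison(means_new, means_old, limits=LIMITS):
    """Return True when all six mean differences fall below their limits.

    Only in that case might a reduced coefficient below 5.0 be found,
    so only then is the full coefficient worth computing.
    """
    return all(
        abs(means_new[name] - means_old[name]) < limit
        for name, limit in limits.items()
    )


# Hypothetical means for two series, for illustration only.
series_a = {"L": 184.0, "B": 138.0, "H'": 134.0,
            "100 B/L": 75.0, "100 H'/L": 72.8, "100 B/H'": 103.0}
series_b = {"L": 185.5, "B": 139.0, "H'": 133.0,
            "100 B/L": 75.1, "100 H'/L": 71.9, "100 B/H'": 104.5}
print(needs_full_comparison(series_a, series_b))
```

A single character exceeding its limit is enough to reject the pair, which is why the test is cheap: most of the possible comparisons are eliminated without computing any coefficient.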

It must be emphasized that a reduced coefficient of racial likeness less than 5.0 represents
a very close degree of resemblance. Values of that order have only been found between
cranial series which would be expected, on account of their provenance, to represent the
same or closely related populations. There is a danger that low reduced coefficients may be
misleading owing to the influence on them of extraneous factors, such as inaccuracy in
sexing or slight and unappreciated differences between the methods of measurement of
two recorders working independently. It is safe to suppose that the two new series are
made up entirely of the crania of adult males. In computing coefficients with them care
was taken to restrict a particular comparison to pairs of means based on measurements
obtained by following precisely the same technique.

Owing partly to that restriction, the numbers of characters that could be used in
computing coefficients with the new series are decidedly smaller than the 31 used ideally for
the purpose. For these comparisons the smallest number of characters used is 9 and the
largest number is 18.* Risdon (1939, pp. 131-2) has examined the matter experimentally
and he concluded that use of a smaller number of characters (the set of 14 he considered
being very similar to the sets we were able to use) can usually be expected to give a fairly
close approximation to the result which would be obtained from about twice as many

Table 3. Coefficients of racial likeness between ancient Egyptian, a Palestinian (Lachish)
and modern series of male skulls from Abyssinia and Crete*

Series                                                                         Crude C.R.L. ± P.E.    Reduced

Sakkara, 1st dyn. (32.1) with Deshasheh and Medum, 4th and 5th dyn. (46.0)      0.19 ± 0.32 (9)        -
                  (32.1) with Abydos, 18th dyn. (49.9)                         -0.08 ± 0.32 (9)        -
                  (31.6) with Lachish (24?.3)                                   1.04 ± 0.25 (14)       1.85 ± 0.45
                  (31.6) with Gizeh, 26th-30th dyn. (885.7)                     2.45 ± 0.25 (14)       4.02 ± 0.41
                  (31.6) with modern Abyssinian (61.4)                          2.43 ± 0.25 (14)       5.82 ± 0.60
                  (31.6) with Abydos, 1st dyn. royal tombs (33.?)               1.91 ± 0.25 (14)       5.87 ± 0.77
                  (30.3) with Thebes, 11th dyn. (48.7)                          4.2? ± 0.22 (18)      11.45 ± 0.59

Thebes, 11th dyn. (49.2) with Sedment, 9th dyn. (37.9)                         -0.43 ± 0.28 (12)       -
                  (49.0) with modern Cretans (50.4)                             2.01 ± 0.30 (10)       4.04 ± 0.60
                  (48.3) with Deshasheh and Medum, 4th and 5th dyn. (46.0)      2.23 ± 0.32 (9)        4.73 ± 0.68

Sedment, 9th dyn. (37.5) with Deshasheh and Medum, 4th and 5th dyn. (39.9)      1.88 ± 0.25 (14)       4.86 ± 0.65
                  (37.7) with modern Cretans (47.9)                             2.71 ± 0.25 (15)       6.42 ± 0.59


* The numbers in brackets following the names of the series are the mean numbers of crania for the characters
used in computing the coefficients. The numbers in brackets following the crude coefficients are the numbers
of characters on which they are based. Woo (1930) gives coefficients with two of the series in the table above,
and the values there differ from his because they were recalculated omitting the term $1/m$, which was discarded
after 1930. The standard deviations of the long E series of 26th-30th dynasty crania from Gizeh (Davin &
Pearson, 1924) were used in computing all the coefficients in the table.

characters. Occasionally, however, use of a smaller number of characters may suggest 
a rather misleading conclusion, and it will tend to indicate a rather wider separation of the 
series than that which would be found if all 31 characters could be used. With these reserva- 
tions in mind the coefficients with the new series may be accepted as the best approxima- 
tions it is possible to obtain in the circumstances. 

All the coefficients of racial likeness found with the Sakkara 1st dynasty and Thebes
11th dynasty series are given in Table 3. Fig. 1 is a reproduction of part of a diagram given
by Risdon (1939, p. 137) for the twenty-two ancient Egyptian and related series with which
he dealt, with the addition of the two series described in the present paper and that of
modern Cretans. The Sakkara series is seen to be an unexceptional member of the ‘Lower
Egyptian’ constellation, having two insignificant coefficients and other close connexions

* The characters common to all the comparisons with the new series are L, B, H', LB, J, NB, 100 B/L,
100 H'/L and 100 B/H'. Others used in some cases are B', U, S, fml, fmb and 100 fmb/fml, and for the coefficient
between the two new series only G'H, NH, L, O₁', O₂, 100 G'H/GB, 100 NB/NH, L, 100




with members of that group. On the other hand, the 11th dynasty soldiers from Thebes 
clearly represent an Egyptian population of an aberrant type. The direct comparison fails 
to distinguish this from that of the 9th dynasty series from Sedment. Woo (1930) had 
found that the latter stands apart from all the other Egyptian series, and he pointed out 
that the Sedment bears a closer resemblance to a series of modern Cretans (von Luschan, 
1913) than to most of the ancient Egyptian series. The Thebes 11th dynasty series has 
a reduced coefficient less than 5-0 with the Cretan, though the latter has no other coefficient 
of this order with any of the other Egyptian series. 

6. The racial history of ancient Egyptian populations. The new evidence makes rather
more precise the racial classification of ancient Egyptian populations given in earlier
craniological papers in Biometrika. The 1st dynasty series of crania from Sakkara is the
earliest in date that has been described representing the region immediately south of the
Delta. It is an unexceptional representative of that group which must have prevailed

[Fig. 1: diagram. Key to connecting lines: insignificant; significant and < 3.5; 3.5-5.0. Series labelled: Lachish; Abydos, royal tombs (1st dyn.); Deshasheh & Medum; Denderah (two series, one Roman); Thebes (11th dyn.); Sedment (9th dyn.); Thebes (18th-21st dyn.); Tigre, Abyssinian (modern); Gizeh (26th-30th dyn.); Abydos (18th-19th dyn.); Cretans (modern); connexions with ‘Upper Egyptian’ series (early predynastic to 18th dyn.).]

Fig. 1. Reduced coefficients of racial likeness between the two new and other ancient
Egyptian and related series of male crania.


in the region — with only slight local and secular variants — from the 1st to the 30th 
dynasty and probably in both earlier and later periods as well. Such populations are said 
to be of ‘ Lower Egyptian ’ type. 

In the region to the south, round Thebes and Abydos, the population was of a second 
racial type from the earliest predynastic (Badari) epoch for which there is any adequate 
craniological evidence. This is called the ‘Upper Egyptian’, though it would be better 
to call it Southern Egyptian. The population became modified slowly down to some 
time about the 18th dynasty. The change was such that the ‘Upper Egyptian’ type 
of population came to bear a closer and closer resemblance to the ‘Lower Egyptian’, 
though the two groups remained clearly distinct. About the 18th dynasty there must have
been a fairly rapid, if not abrupt, change in the racial composition of the population of the
Thebes and Abydos region. Nearly all the series from there, of that and later dates, are not
of ‘Upper’ but of ‘Lower Egyptian’ type. They diverge slightly from the populations of
the region immediately south of the Delta, however, in the direction of the ‘Upper
Egyptian’ type. Six of these series (viz. those from Abydos, Thebes and Denderah of
dates ranging from the 18th dynasty to Roman times) are shown in Fig. 1. The 1st dynasty
series from royal tombs at Abydos, also shown there, is an exception on account of its date.
The obvious explanation of its peculiar position is that it represents an intrusive and more 
or less isolated community which was derived from the other centre of population to the 
north. 

This accounts for twenty-two of the twenty-four series of crania considered. The
classification of these does not seem to necessitate reference to any non-Egyptian peoples.
This is not so, however, in the case of the remaining two series, viz. the new one of 11th
dynasty soldiers from Thebes and the 9th dynasty series from Sedment (Deltaic region).
These two might represent the same population as far as can be seen from the direct
comparison, and both stand apart from the ‘Lower Egyptian’ constellation of series (see
Fig. 1). The fact that the 11th dynasty series from Thebes has a close resemblance to one
of Cretans, which is of modern date, suggests that the two aberrant communities in question
may have been derived from the crossing of ancient Egyptians with people from some
European or Asiatic source.

The mean basio-bregmatic heights (H'), cephalic indices and height-length indices are
higher for the 11th dynasty Thebes and Sedment series than for any other of the series
considered. The types, defined by average measurements, of these two thus diverge from
that prevailing in ancient Egypt in the direction of the ‘Armenoid’ type. Elliot Smith
(1911 and elsewhere) supposed that intrusive ‘Armenoid’ aliens played a considerable
part in modifying the population of the country and that ‘long before the time of the
New Empire, Egypt was permeated from one end to the other with this foreign element’.

Our interpretation of the evidence fails entirely to support this hypothesis. There is no
need to suppose that any people foreign to the country played a substantial part in
modifying its population from predynastic to Roman times. The communities represented
by the 11th dynasty Thebes and Sedment series may possibly have been derived from the
crossing of Egyptian and ‘Armenoid’ people, but they stand apart. The remarkable point
is not that two out of twenty-four populations should be peculiar in that way, but that the
remaining twenty-two show interrelationships which do not suggest any admixture with
alien stock. They can readily be explained on the supposition that there was a steady
transference of population from the Deltaic region to the region of Thebes and Abydos,
where the population was originally of a somewhat different type, from early predynastic
times to the 18th dynasty. About that time the movement must have been accelerated,
and thereafter the populations of the two centres were almost indistinguishable in racial
type. The racial history of ancient Egypt was of a simple kind.

7. Summary and conclusions. This paper deals with forty-four male crania of 1st dynasty
date from Sakkara and with fifty-five crania of 11th dynasty soldiers from Thebes.
Individual measurements taken by Prof. D. E. Derry are given in appended tables. Judging
from the rather small samples, the two populations represented exhibited the same order 
of variation, while both were rather less mixed in racial composition than the population 
of Giza from the 26th-30th dynasties. Mean measurements clearly differentiate the two 
new series from one another. Judging from characters considered singly, both series bear 
a close resemblance to some other ancient Egyptian series, and both are of ‘ Lower ’ rather 
than ‘Upper Egyptian’ type. Comparisons are made by the method of the coefficient of 
racial likeness, though decidedly fewer characters than the standard set of thirty-one used 
when possible are available for the purpose. The resulting relationships are shown in Fig. 1. 





[Folding tables of individual measurements (Appendices II-III) for the two series, including the crania from Thebes. Legible column headings: Grave no., fmb, Remarks. The remarks record estimated ages, metopic skulls, distortion by grave pressure and fractures. Footnote to the tables: Prof. Derry's method of taking the measurement does not accord with biometric practice.]



The Sakkara 1st dynasty series, which is the earliest from the region immediately south
of the Delta, is an unexceptional member of the ‘Lower Egyptian’ constellation, and it can
be supposed to typify the population of Northern Egypt at the time. The 11th dynasty
series of soldiers from Thebes is linked to the same group, but it diverges from it. The
type is indistinguishable from that of a 9th dynasty series from Sedment. The former also
has a link with the type of a series of modern Cretans. The two aberrant communities of
Thebes and Sedment must be supposed to have been derived from the crossing of ancient
Egyptians with people from some European or Asiatic source. Our knowledge of the racial
history of ancient Egypt derived from craniological evidence is reviewed.


REFERENCES

Davin, A. G. & Pearson, K. (1924). On the biometric constants of the human skull. Biometrika, 16, 328-63.
Duckworth, W. L. H. (1913). International Agreements for the Unification (a) of Craniometric and Cephalometric
Measurements, (b) of Anthropometric Measurements to be made on the Living Subject. Camb. Univ. Press.
Luschan, F. von (1913). Beiträge zur Anthropologie von Kreta. Z. Ethn. 45, 307-93.
Macramallah, R. (1940). Un Cimetière Archaïque de la Classe Moyenne à Saqquara. Imp. Nat. le Caire.
Morant, G. M. (1925). A study of Egyptian craniology from prehistoric to Roman times. Biometrika, 17, 1-52.
Morant, G. M. (1935). A study of predynastic Egyptian skulls from Badari based on measurements taken by
Miss B. N. Stoessiger and Prof. D. E. Derry. Biometrika, 27, 293-309.
Risdon, D. L. (1939). A study of the cranial and other human remains from Palestine excavated at Tell Duweir
(Lachish) by the Wellcome-Marston archaeological research expedition. Biometrika, 31, 99-166.
Sergi, S. (1912). Crania Habessinica. Contributo all' Antropologia dell' Africa Orientale. Rome: Loescher.
Smith, G. Elliot (1911). The Ancient Egyptians and their Influence upon the Civilization of Europe. Harper.
Stoessiger, B. N. (1927). A study of the Badarian crania recently excavated by the British School of
Archaeology in Egypt. Biometrika, 19, 110-50.
Thomson, A. & Randall-MacIver, D. (1905). The Ancient Races of the Thebaid. Oxford Univ. Press.
Winlock, H. E. (1928). The Egyptian Expedition, 1925-27. Bull. Met. Mus. Art, New York.
Woo, T. L. (1930). A study of seventy-one ninth dynasty Egyptian skulls from Sedment. Biometrika, 22, 66-93.


Appendix I. Definitions of measurements

Individual measurements of the two series of crania are given in Appendices II-III. The
contractions used there and in tables in the text to denote characters are:

L = maximum glabella-occipital length. B = maximum horizontal breadth. B' = minimum
frontal breadth. H' = basio-bregmatic height. Aur. ht. = ‘vertical height from line
joining highest points of external auditory meatuses’. LB = basion to nasion. U = maximum
horizontal circumference above the superciliary ridges. S₁ = arc nasion to bregma.
S₂ = arc bregma to lambda. S₃ = arc lambda to opisthion. S = total sagittal arc from
nasion to opisthion. Broca's Q' = transverse arc from ‘the most prominent point on the
posterior root of the left zygoma, exactly above the auditory aperture’, to the same point
on the right passing through the bregma. fml = basion to opisthion. fmb = maximum
breadth of foramen magnum. G'H = nasion to alveolar point. GB = facial breadth between
lowest points on zygomatico-maxillary sutures. J = maximum breadth between zygomatic
arches. NH, L = nasal height from nasion to point furthest removed from it on the
margin of the left pyriform aperture. NB = maximum breadth of the pyriform aperture.
O₁' = breadth of right orbit from the dacryon. O₂ = maximum height of right orbit.
GL = basion to prosthion.





THE GENERALIZATION OF ‘STUDENT'S’ PROBLEM
WHEN SEVERAL DIFFERENT POPULATION
VARIANCES ARE INVOLVED


By B. L. WELCH, B.A., Ph.D.


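1. Introduction and summary. Let $\eta$ be a population parameter which is estimated by an
observed quantity $y$, normally distributed with variance $\sigma_y^2$. Let $\sigma_y^2 = \sum_{i=1}^{k}\lambda_i\sigma_i^2$, where the
$\lambda_i$ are known positive numbers and the $\sigma_i^2$ are unknown variances. Suppose that the observed
data provide estimates $s_i^2$ of these variances, based on $f_i$ degrees of freedom, respectively,
so that the sampling distribution of $s_i^2$ is

$$p(s_i^2)\,ds_i^2 = \frac{1}{\Gamma(\tfrac12 f_i)}\left(\frac{f_i s_i^2}{2\sigma_i^2}\right)^{\frac12 f_i-1}\exp\left(-\frac{f_i s_i^2}{2\sigma_i^2}\right)d\left(\frac{f_i s_i^2}{2\sigma_i^2}\right), \qquad (1)$$

and that these estimates are distributed independently of each other and of $y$.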

A very simple particular case of this set-up occurs when we have samples of $n_1$ and $n_2$,
respectively, from two normal populations with true means $\alpha_1$ and $\alpha_2$ and standard
deviations $\sigma_1$ and $\sigma_2$. If $\eta$ is the true difference $(\alpha_1-\alpha_2)$ between the means, the estimated difference
is $y = (\bar x_1-\bar x_2)$. The variance of the estimate is $\sigma_y^2 = (\lambda_1\sigma_1^2+\lambda_2\sigma_2^2)$, where $\lambda_1 = 1/n_1$ and
$\lambda_2 = 1/n_2$. The estimated values of $\sigma_1^2$ and $\sigma_2^2$ are $s_1^2 = \Sigma_1/f_1$ and $s_2^2 = \Sigma_2/f_2$, where $\Sigma_1$ and $\Sigma_2$
are the respective sums of squares of observations from the individual sample means and
$f_1 = (n_1-1)$ and $f_2 = (n_2-1)$. These $s_i^2$ are distributed in the form (1) and the postulated
conditions of independence hold.
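
As a concrete reading of the two-sample case, the following Python sketch computes y, the λ's, the s² and the degrees of freedom from two lists of observations. The function name and the data are hypothetical and serve only to fix the notation.

```python
from statistics import mean


def two_sample_setup(x1, x2):
    """Return (y, lam, s2, f, var_y_est) for the two-sample case.

    y         : difference of the sample means, estimating eta
    lam       : (1/n1, 1/n2)
    s2        : (s1^2, s2^2), sums of squares about the means over f_i
    f         : (n1 - 1, n2 - 1)
    var_y_est : lam1*s1^2 + lam2*s2^2, the estimated variance of y
    """
    n1, n2 = len(x1), len(x2)
    m1, m2 = mean(x1), mean(x2)
    f = (n1 - 1, n2 - 1)
    ss1 = sum((v - m1) ** 2 for v in x1)
    ss2 = sum((v - m2) ** 2 for v in x2)
    s2 = (ss1 / f[0], ss2 / f[1])
    lam = (1.0 / n1, 1.0 / n2)
    y = m1 - m2
    var_y_est = lam[0] * s2[0] + lam[1] * s2[1]
    return y, lam, s2, f, var_y_est


# Hypothetical observations, for illustration only.
y, lam, s2, f, var_y_est = two_sample_setup([1, 2, 3, 4, 5], [2, 4, 6])
print(y, lam, s2, f, var_y_est)
```

Nothing here assumes the two population variances are equal; the two estimates are kept separate and enter only through the weighted sum.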

Another particular case, again with $k = 2$, arises when we wish to compare two regression
coefficients, fitted to independent sets of data, without making the assumption that the
population residual variance about the true regression line is the same for both sets.

The present paper is written mainly with these practical applications of the case $k = 2$
in mind, but the results are expressed generally for any $k$, since no further analytical
difficulties are involved. It will be shown how probability statements about $y$, considered as
an estimate of $\eta$, may be made similar in character to those which W. S. Gosset derived for
the mean of a single sample of $n$ observations (‘Student’, 1908). We shall, in effect, seek a
quantity $h$, calculable from the observations, with the property that the chance of the
difference $(y-\eta)$ falling short of $h$ is a given probability $P$. It is clear that $h$ must be a
function of the individual variances $s_i^2$ and of $P$. If the abbreviation Pr. is used to mean
‘the probability of the relation in the bracket following’, our problem is to satisfy the
equation

$$\text{Pr.}\,[(y-\eta) < h(s_1^2, s_2^2, \ldots, s_k^2, P)] = P. \qquad (2)$$

In Gosset's case the solution was, of course, simply

$$\text{Pr.}\,[(\bar x-\alpha) < t_P\,s/\sqrt{n}] = P, \qquad (3)$$

where $t_P$ is the value, corresponding to the probability level $P$, in the ‘Student’ $t$-distribution
with $f = (n-1)$ degrees of freedom.





In the next section the mathematical derivation of the exact solution of (2) is given. This
is then followed by some consideration of its expression in numerical terms. First, a series
solution in powers of $1/f_i$ is developed, which may be used for calculating tables. Then some
comparisons are made with a non-series approximate solution which is based on a particular
way of regarding the distribution of a quantity of the general form $z = \Sigma a_i\chi_i^2$.

Some brief discussion is then added which may serve to place the present contribution
in its proper relationship to other papers which have been written on this topic.

Finally, it is shown how the inequality (2) may be adapted to provide an interval estimate
for $\eta$.


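2. Mathematical derivation of solution. Let $j(s_1^2, s_2^2, \ldots, s_k^2, P)$ denote the probability that
$(y-\eta)$ is less than $h(s_1^2, s_2^2, \ldots, s_k^2, P)$, given $s_i^2$ $(i = 1, 2, \ldots, k)$. Then, since $y$ is distributed quite
independently of the estimated variances, we have

$$j(s_1^2, s_2^2, \ldots, s_k^2, P) = \int_{u=-\infty}^{h(s_1^2,\ldots,s_k^2,P)/\sqrt{\Sigma\lambda_i\sigma_i^2}}\frac{1}{\sqrt{2\pi}}\,e^{-\frac12 u^2}\,du = I\left(\frac{h(s_1^2, s_2^2, \ldots, s_k^2, P)}{\sqrt{\Sigma\lambda_i\sigma_i^2}}\right), \qquad (4)$$

where $I$ is used to denote the normal probability integral. The condition of equation (2) is
then simply that, if $j(s_1^2, s_2^2, \ldots, s_k^2, P)$ is averaged over the probability distributions of $s_i^2$ as
given by (1), the result will equal $P$. Thus

$$\int\cdots\int j(s_1^2, s_2^2, \ldots, s_k^2, P)\,\prod_i p(s_i^2)\,ds_i^2 = P. \qquad (5)$$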


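Now we may expand $j(s_1^2, s_2^2, \ldots, s_k^2, P)$ about an origin $(\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2)$ in a Taylor expansion.
Thus

$$j(s_1^2, s_2^2, \ldots, s_k^2, P) = \exp\left[\Sigma(s_i^2-\sigma_i^2)\,\partial_i\right]j(w_1, w_2, \ldots, w_k, P), \qquad (6)$$

it being understood that the exponential is to be expanded in a power series in $\partial_i$, and that
$\partial_i$ is to be interpreted so that

$$\partial_i^{\,r}\,j(w_1, w_2, \ldots, w_k, P) = \left[\frac{\partial^r j(w_1, w_2, \ldots, w_k, P)}{\partial w_i^{\,r}}\right]_{w_j=\sigma_j^2}. \qquad (7)$$

On making the substitution of (6) into (5) our result may be written

$$\Theta\,j(w_1, w_2, \ldots, w_k, P) = P, \qquad (8)$$

where

$$\Theta = \prod_i\int\exp\left[(s_i^2-\sigma_i^2)\,\partial_i\right]p(s_i^2)\,ds_i^2. \qquad (9)$$

Now, substituting into (9) from (1), the integral comes out in simple form, i.e.

$$\Theta = \prod_i\exp\left[-\sigma_i^2\partial_i\right]\left(1-\frac{2\sigma_i^2\partial_i}{f_i}\right)^{-\frac12 f_i} = \exp\left[\Sigma\frac{\sigma_i^4\partial_i^2}{f_i}+\frac43\,\Sigma\frac{\sigma_i^6\partial_i^3}{f_i^2}+2\,\Sigma\frac{\sigma_i^8\partial_i^4}{f_i^3}+\text{etc.}\right]. \qquad (10)$$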
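
The step from the closed form to the series in (10) is the logarithmic expansion of a single factor, and it can be checked numerically. In the sketch below the operator product σ_i²∂_i is replaced by an ordinary small number x; this substitution, and the function names, are assumptions made purely to test the algebra, not part of the original argument.

```python
import math


def log_theta_exact(x, f):
    """Logarithm of exp(-x) * (1 - 2x/f)^(-f/2), with x standing for sigma^2 * d."""
    return -x - 0.5 * f * math.log(1.0 - 2.0 * x / f)


def log_theta_series(x, f):
    """First three terms of the expansion in (10) for one factor."""
    return x ** 2 / f + (4.0 / 3.0) * x ** 3 / f ** 2 + 2.0 * x ** 4 / f ** 3


x, f = 0.1, 10
print(log_theta_exact(x, f), log_theta_series(x, f))
```

The term linear in x cancels between the two factors, which is why the expansion starts at the second power and every remaining term carries at least one factor 1/f.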




Substituting (4) into (8) we have finally

$$\Theta\,I\left(\frac{h(w_1, w_2, \ldots, w_k, P)}{\sqrt{\Sigma\lambda_i\sigma_i^2}}\right) = P. \qquad (11)$$

This, in a very condensed form, is the solution to our problem.* The operator $\Theta$ constitutes
a direction to carry out the partial differentiations indicated by (10); $w_i$ must then be equated
to $\sigma_i^2$. The solution of the resulting equation will give $h(\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2, P)$ and therefore the
required $h(s_1^2, s_2^2, \ldots, s_k^2, P)$.

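3. The development of the series solution. It will be convenient to write $h(w)$ for
$h(w_1, w_2, \ldots, w_k, P)$ and $\xi$ for the normal deviate such that $I(\xi) = P$. We may then expand
$I\left[h(w)/\sqrt{\Sigma\lambda_i\sigma_i^2}\right]$ in a Taylor series about $\xi$ as origin. Thus

$$I\left[\frac{h(w)}{\sqrt{\Sigma\lambda_i\sigma_i^2}}\right] = \exp\left[\left(\frac{h(w)}{\sqrt{\Sigma\lambda_i\sigma_i^2}}-\xi\right)D\right]I(v), \qquad (12)$$

it being understood that the exponential is to be expanded in powers of $D$, and that these
powers are to be interpreted so that

$$D^r I(v) = \left[\frac{d^r I(v)}{dv^r}\right]_{v=\xi}. \qquad (13)$$

Equation (11) then becomes

$$\Theta\exp\left[\left(\frac{h(w)}{\sqrt{\Sigma\lambda_i\sigma_i^2}}-\xi\right)D\right]I(v) = I(\xi). \qquad (14)$$

This may now be solved by successive approximations.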

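The initial approximation is the large-sample normal approximation

$$h_0(w) = \xi\sqrt{\Sigma\lambda_i w_i}, \qquad (15)$$

and we may write

$$h(w) = \xi\sqrt{\Sigma\lambda_i w_i}+h_1(w)+h_2(w)+\text{etc.}, \qquad (16)$$

where $h_1(w)$ includes terms of order $1/f_i$, $h_2(w)$ terms of order $1/f_i^2$ and so on. For the moment
we shall treat terms of the order $1/f_i^2$ as negligible. Then (14) gives

$$\Theta\exp\left[\left(\frac{\xi\sqrt{\Sigma\lambda_i w_i}+h_1(w)}{\sqrt{\Sigma\lambda_i\sigma_i^2}}-\xi\right)D\right]I(v) = I(\xi), \qquad (17)$$

i.e.

$$\Theta\exp\left[\xi\left(\frac{\sqrt{\Sigma\lambda_i w_i}}{\sqrt{\Sigma\lambda_i\sigma_i^2}}-1\right)D\right]\left\{1+\frac{h_1(w)D}{\sqrt{\Sigma\lambda_i\sigma_i^2}}\right\}I(v) = I(\xi). \qquad (18)$$

Or, using (10),

$$I(\xi)+\left[\frac{h_1(\sigma^2)D}{\sqrt{\Sigma\lambda_i\sigma_i^2}}+\Sigma\frac{\sigma_i^4\partial_i^2}{f_i}\exp\left\{\xi\left(\frac{\sqrt{\Sigma\lambda_i w_i}}{\sqrt{\Sigma\lambda_i\sigma_i^2}}-1\right)D\right\}\right]I(v) = I(\xi). \qquad (19)$$

The equation of the first order term to zero gives

$$h_1(\sigma^2) = \xi\sqrt{\Sigma\lambda_i\sigma_i^2}\;\frac{(1+\xi^2)}{4}\,\frac{\Sigma(\lambda_i^2\sigma_i^4/f_i)}{(\Sigma\lambda_i\sigma_i^2)^2}. \qquad (20)$$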


* Equation (11) can also be expressed as an integral equation and this form may be necessary for
providing numerical values where the $f_i$ are very small.





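This can then be substituted in the second-order term which, when equated to zero, will
give $h_2(\sigma^2)$. The process may obviously be extended to higher orders, although the expressions
become so complex that a slightly different procedure has then been found to be preferable.
To terms of order $1/f_i^2$ our solution is

$$h(s^2) = \xi\sqrt{\Sigma\lambda_i s_i^2}\left[1+\frac{(1+\xi^2)}{4}\frac{\Sigma(\lambda_i^2 s_i^4/f_i)}{(\Sigma\lambda_i s_i^2)^2}-\frac{(1+\xi^2)}{2}\frac{\Sigma(\lambda_i^2 s_i^4/f_i^2)}{(\Sigma\lambda_i s_i^2)^2}+\frac{(3+5\xi^2+\xi^4)}{3}\frac{\Sigma(\lambda_i^3 s_i^6/f_i^2)}{(\Sigma\lambda_i s_i^2)^3}-\frac{(15+32\xi^2+9\xi^4)}{32}\frac{\{\Sigma(\lambda_i^2 s_i^4/f_i)\}^2}{(\Sigma\lambda_i s_i^2)^4}\right]. \qquad (21)$$

It may be noted that in the particular case $k = 1$, this reduces, as it should, to the already
known expansion of the deviate of the straightforward ‘Student’ distribution (Fisher,
1941, p. 161), viz.

$$\xi\left[1+\frac{(1+\xi^2)}{4f}+\frac{(3+16\xi^2+5\xi^4)}{96f^2}+\text{etc.}\right]. \qquad (22)$$

It is proposed in another communication to give tables of $h(s^2)$ based on the expansion
(21) carried to some further terms.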
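
A short Python sketch of the series (21), truncated at the terms shown, may help to make the structure of the expansion concrete. The function names are hypothetical; the normal deviate ξ is taken from the standard library's `statistics.NormalDist`, and the single-sample series (22) is coded separately so that the reduction for k = 1 can be checked.

```python
import math
from statistics import NormalDist


def welch_h(P, lam, s2, f):
    """Series (21) for h(s^2), to terms of order 1/f_i^2."""
    xi = NormalDist().inv_cdf(P)          # normal deviate with I(xi) = P
    x2 = xi * xi
    terms = [l * s for l, s in zip(lam, s2)]   # lambda_i * s_i^2
    v = sum(terms)
    a1 = sum(t ** 2 / fi for t, fi in zip(terms, f)) / v ** 2
    a2 = sum(t ** 2 / fi ** 2 for t, fi in zip(terms, f)) / v ** 2
    a3 = sum(t ** 3 / fi ** 2 for t, fi in zip(terms, f)) / v ** 3
    bracket = (
        1.0
        + (1 + x2) / 4 * a1
        - (1 + x2) / 2 * a2
        + (3 + 5 * x2 + x2 * x2) / 3 * a3
        - (15 + 32 * x2 + 9 * x2 * x2) / 32 * a1 * a1
    )
    return xi * math.sqrt(v) * bracket


def student_deviate_series(P, f):
    """Series (22) for the 'Student' deviate, to terms of order 1/f^2."""
    xi = NormalDist().inv_cdf(P)
    x2 = xi * xi
    return xi * (1 + (1 + x2) / (4 * f) + (3 + 16 * x2 + 5 * x2 * x2) / (96 * f * f))


# With k = 1, lambda = 1 and s^2 = 1 the two series should coincide.
print(welch_h(0.975, [1.0], [1.0], [20]), student_deviate_series(0.975, 20))
```

For k = 1 the three second-order terms of (21) combine to (3 + 16ξ² + 5ξ⁴)/96f², which is the check the text refers to.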

4. Discussion of a non-series approximation. It will be recalled that in Gosset's original
approach to the single sample problem (‘Student’, 1908) his initial step was to note that the
first four moments of the distribution of $s^2$ were consistent with the assumption that the
distribution could be represented by a Pearson Type III curve. He was fortunate in this
way to rediscover a distribution which had already been found by Helmert, as this permitted
him to go on to the derivation of the $t$-distribution. In our present case, as in many others
arising naturally in statistical work, we are led to consider, instead of a single $s^2$, a linear function
$\Sigma\lambda_i s_i^2$ of several $s_i^2$. If this linear function were distributed in a Pearson Type III distribution
a whole range of new problems could be dealt with by well-established theory. However, in
general, we do not have this good fortune. For $\Sigma\lambda_i s_i^2$ is of the form $\Sigma a_i\chi_i^2$, where $a_i = \lambda_i\sigma_i^2/f_i$,
and the distribution of this quantity is only of Type III if all the $a_i$, except one, are zero, or
if all the $a_i$ happen to be equal.

Nevertheless, for practical purposes an approximation to the distribution of $\sum\lambda_i s_i^2$, using
a Type III curve with start, mean and variance suitably adjusted, can still be useful. In two
previous papers (Welch, 1936, 1938) I have employed this method to obtain numerical
comparisons of the merits of different statistical procedures, where full calculations with the
true distributions would have been unduly laborious. The method of determining the
constants in the approximation was given for the case $k = 2$ in the first of these papers and is
as follows.

If $z = a\chi_1^2 + b\chi_2^2$, and the approximate distribution curve is written in the form

$$p(z) = \frac{1}{(2g)^{\frac12 f}\,\Gamma(\tfrac12 f)}\, z^{\frac12 f - 1}\, e^{-z/(2g)}, \tag{23}$$

then making the first two moments of (23) agree with the true moments of $z$, we find

$$g = \frac{a^2 f_1 + b^2 f_2}{a f_1 + b f_2}, \qquad f = \frac{(a f_1 + b f_2)^2}{a^2 f_1 + b^2 f_2}. \tag{24}$$

Phrasing the matter rather differently, we can say that $z/g$ is approximately distributed as
$\chi^2$ with degrees of freedom $f$. Of course $f$, given by (24), will in general be fractional, but the
letter used to designate this quantity was chosen, and the term ‘effective degrees of freedom’
has been used, because by doing so we can appeal immediately to a considerable body of
further theoretical results.
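The moment-matching step can be illustrated with a few lines of code. This is a minimal sketch, assuming only that $\chi_1^2$ and $\chi_2^2$ are independent with $f_1$ and $f_2$ degrees of freedom, so that $z$ has mean $af_1 + bf_2$ and variance $2(a^2f_1 + b^2f_2)$; the function name `type3_constants` is hypothetical.

```python
def type3_constants(a, b, f1, f2):
    """Return (g, f) such that g * chi-square(f) has the same mean and
    variance as z = a*chi2(f1) + b*chi2(f2) with independent components."""
    mean_z = a * f1 + b * f2                    # E(z)
    var_z = 2.0 * (a * a * f1 + b * b * f2)     # var(z)
    g = var_z / (2.0 * mean_z)                  # scale constant
    f = 2.0 * mean_z ** 2 / var_z               # effective degrees of freedom
    return g, f


g, f = type3_constants(0.5, 2.0, 5, 8)
print(g, f)   # f is fractional in general
```

When $a = b$ the function returns $g = a$ and $f = f_1 + f_2$, the one case in which the Type III form is exact for two components.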

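In particular we can say that the criterion

$$v = \frac{y-\eta}{\sqrt{(\lambda_1 s_1^2 + \lambda_2 s_2^2)}} \tag{25}$$

follows approximately the ‘Student’ $t$-distribution with degrees of freedom

$$f = \frac{(\lambda_1\sigma_1^2 + \lambda_2\sigma_2^2)^2}{\dfrac{\lambda_1^2\sigma_1^4}{f_1} + \dfrac{\lambda_2^2\sigma_2^4}{f_2}}. \tag{26}$$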

More generally, when $k$ is not restricted to 2, the same line of argument leads us to say
that the criterion

$$v = \frac{y-\eta}{\sqrt{(\sum\lambda_i s_i^2)}} \tag{27}$$

is approximately distributed as ‘Student’s’ $t$ with degrees of freedom

$$f = \frac{(\sum\lambda_i\sigma_i^2)^2}{\sum(\lambda_i^2\sigma_i^4/f_i)}. \tag{28}$$

Not knowing the $\sigma_i^2$ in (28), there are several ways in which we may now proceed, depending
on what weight we may be willing to attach to any vague a priori notions we may possess
of their relative magnitudes (cf. Welch, 1938). If we are not willing to assume anything,
perhaps the best choice is

$$f = \frac{(\sum\lambda_i s_i^2)^2 - 2\sum\{\lambda_i^2 s_i^4/(f_i+2)\}}{\sum\{\lambda_i^2 s_i^4/(f_i+2)\}}. \tag{29}$$

It may be shown that the numerator of (29) has, in repeated samples, an average value
$(\sum\lambda_i\sigma_i^2)^2$ and the denominator has average value $\sum\lambda_i^2\sigma_i^4/f_i$. In a certain sense, therefore,
(29) is a fair estimate of (28).

To sum up, then, the interpretation of $y$ as an estimate of $\eta$, using the present type of
approximation, involves only the reference of the criterion (27) to tables of the ‘Student’
distribution, entered with degrees of freedom given by (29).
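The procedure just summarized amounts to two small computations. The sketch below is illustrative only: it assumes that (29) has the form $\{(\sum\lambda_i s_i^2)^2 - 2D\}/D$ with $D = \sum\lambda_i^2 s_i^4/(f_i+2)$, which is the form whose numerator and denominator have the average values quoted in the text, and the function names are hypothetical. In the two-sample comparison of means one would take $\lambda_i = 1/n_i$ and $f_i = n_i - 1$.

```python
from math import sqrt


def criterion_v(y, eta, lam, s2):
    """Criterion (27): (y - eta) / sqrt(sum(lam_i * s_i^2))."""
    return (y - eta) / sqrt(sum(l * s for l, s in zip(lam, s2)))


def effective_df(lam, s2, f):
    """Estimated effective degrees of freedom, in the form assumed for (29)."""
    total = sum(l * s for l, s in zip(lam, s2))
    denom = sum((l * s) ** 2 / (fi + 2) for l, s, fi in zip(lam, s2, f))
    return (total ** 2 - 2.0 * denom) / denom


# Two samples of sizes 10 and 15 with variance estimates 4.0 and 9.0.
lam, s2, f = [1 / 10, 1 / 15], [4.0, 9.0], [9, 14]
print(criterion_v(5.0, 3.0, lam, s2), effective_df(lam, s2, f))
```

The value returned by `effective_df` is in general fractional, so tables of the ‘Student’ distribution would have to be interpolated; with a single variance estimate it returns $f_1$ exactly.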

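Some further light is now thrown on this procedure by the expansion for the exact solution
of our problem derived in the preceding section. For the implications of referring $v$ to the
‘Student’ distribution may be seen by substituting $f$ from (29) into the expansion (22) of
the ‘Student’ deviate. On doing this and then expanding in powers of $1/f_i$ it is found that,
in effect, our approximation corresponds to assuming that

$$h(s^2) = \xi\sqrt{(\textstyle\sum\lambda_i s_i^2)}\left[1 + \frac{1+\xi^2}{4}\,\frac{\sum(\lambda_i^2 s_i^4/f_i)}{(\sum\lambda_i s_i^2)^2} - \frac{1+\xi^2}{2}\,\frac{\sum(\lambda_i^2 s_i^4/f_i^2)}{(\sum\lambda_i s_i^2)^2} + \frac{51+64\xi^2+5\xi^4}{96}\,\frac{\{\sum(\lambda_i^2 s_i^4/f_i)\}^2}{(\sum\lambda_i s_i^2)^4}\right], \tag{30}$$

whereas, in fact, the true solution is given by (21). Comparison shows that we have exact
agreement to terms of order $1/f_i$ and in the first of the quadratic terms. To the second order
the difference between the expressions in square brackets in equations (21) and (30) is

$$\frac{3+5\xi^2+\xi^4}{3}\left[\frac{\sum(\lambda_i^3 s_i^6/f_i^2)}{(\sum\lambda_i s_i^2)^3} - \frac{\{\sum(\lambda_i^2 s_i^4/f_i)\}^2}{(\sum\lambda_i s_i^2)^4}\right]. \tag{31}$$

This difference vanishes if any one of the $s_i^2$ is overwhelmingly larger than all the others, or
if $s_i^2$ is proportional to $f_i/\lambda_i$. It appears that, in general, the difference is not likely to be large.
We have, therefore, found some justification for using the Type III approximation in the
present case.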
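The behaviour of the second-order difference can be examined numerically. The sketch below assumes the difference has the form $\tfrac13(3+5\xi^2+\xi^4)\{C/A^3 - B^2/A^4\}$, with $A = \sum\lambda_i s_i^2$, $B = \sum\lambda_i^2 s_i^4/f_i$ and $C = \sum\lambda_i^3 s_i^6/f_i^2$; the function name `second_order_gap` is hypothetical.

```python
def second_order_gap(xi, lam, s2, f):
    """Difference between the exact and Type III brackets to second order,
    in the form assumed in the lead-in."""
    terms = [l * s for l, s in zip(lam, s2)]
    A = sum(terms)
    B = sum(t ** 2 / fi for t, fi in zip(terms, f))
    C = sum(t ** 3 / fi ** 2 for t, fi in zip(terms, f))
    return (3 + 5 * xi ** 2 + xi ** 4) / 3 * (C / A ** 3 - B ** 2 / A ** 4)


# lam_i * s_i^2 proportional to f_i: the gap should vanish.
print(second_order_gap(1.6449, [1, 1, 1], [2.0, 5.0, 11.0], [4, 10, 22]))
# A less balanced case: the gap is small but not zero.
print(second_order_gap(1.6449, [1, 1], [1.0, 4.0], [5, 5]))
```

By the Cauchy-Schwarz inequality $B^2 \le AC$, so under the assumed form the gap is never negative.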

The above comparison has been made on the basis of the series developments, but it should
be borne in mind that approximations based on positive frequency functions, such as those
falling under the Pearson system, usually provide a higher degree of accuracy than might
appear from any consideration of expansions. Furthermore, they are apt to give an insight
into the nature of the situation which may sometimes be lost in working out the details of
exact solutions. In the present case I feel that the comparison of this section serves to give
added confidence in the exact solution,* which I have put forward in the previous two
sections, quite as much as it demonstrates the value of the approximate method.


5. Further discussion. In comparing the present contribution with other work on the
subject, the essential point to notice is the averaging process involved in equation (6). We
are not trying here to make probability statements valid for fixed $s_i^2$, but are averaging over
the joint probability distribution of the $s_i^2$, taking into account, therefore, the different
values which can arise by chance in sampling from populations with fixed $\sigma_i^2$.

This averaging over the joint distribution of the $s_i^2$ is parallel to the step taken in
Section III of Gosset’s original memoir (1908) where, in effect, he starts with the distribution
of $t$ for samples with fixed $s$ and then averages over the distribution of $s$ which he has already
derived earlier. He thus arrives at the unrestricted distribution of $t$ (or, more strictly, of a
quantity $z$, which is equal to $t$ multiplied by a constant). This distribution forms the basis
of the significance tests which he illustrates in his Section IX and of the method of deriving
interval estimates for the population mean which he outlines in his Section VIII.

In the present paper the parallelism with Gosset’s work may be obscured to some extent 
by the fact that we do not from the outset seek the probability distribution of some pivotal 
quantity like t, explicitly expressed. It so happens that we are able to proceed to a method 
of deriving an expansion for the required probability level without making explicit reference 
to such a quantity. Nevertheless there remains the important resemblance with Gosset’s 
development, in that we do not confine ourselves to samples with fixed $s_i^2$.

This procedure stands in sharp contrast to the formulation of the problem of comparing 
two means, favoured by R. A. Fisher (e.g. 1941) and H. Jeffreys (1940). These writers 
prefer a solution which they ascribe initially to W. U. Behrens (1929). Looked at from one 
point of view, Behrens’s paper appears to contain some gross algebraical errors. Fisher and 
Jeffreys, however, develop lines of argument by means of which they claim that Behrens’s 
solution is quite justified. It seems to me difficult to say how far (if at all) any of these 
arguments may have been in Behrens’s mind when he wrote his paper and I shall not 
attempt to elucidate this question here. We may, however, permit ourselves one observation 
about the developments according to Fisher and Jeffreys. 

* Exact in the sense that it is independent of the irrelevant population parameters $\sigma_i^2$.


Both these writers, at some stage, limit the field of their probability inferences to a
subset in which the $s_i^2$ are regarded as fixed. In order to solve the problem on these lines Jeffreys
introduces an a priori distribution function for the unknown $\sigma_i$, following his general
philosophy for dealing with such questions. Fisher, on the other hand, arrives at the same
answer by a special utilization of what he terms the fiducial distribution of $\sigma_i^2$.

Jeffreys’s approach here does not raise any new issues to those who are familiar with the 
general body of his researches on statistical inference. Fisher’s justification of Behrens’s 
solution is perhaps of more immediate interest as it raises controversial points which are 
important more specifically in relation to our present topic of discussion. For although 
Fisher’s approach has been very much criticized by a number of writers, starting with 
M. S. Bartlett (1936), the critics have not wished to throw doubt on the whole body of 
results which Fisher includes under the heading of fiducial inference. The criticism has been 
for the most part selective, directed mainly at the way in which so-called simultaneous
fiducial distributions of several parameters have been defined and manipulated. 

I have, myself, quite definite views on these questions (particularly on the usage of the
word ‘fiducial’) but do not feel that I need express them at any great length here. I
disagree with Fisher, but this divergence of opinion must already have become apparent in
the way I have defined the field within which I make my probability inferences about $\eta$.
It appears to me to be quite artificial to restrict our view to one which, even in a limited
sense, fixes $s_i^2$. It is true that, in the two-sample problem, we have to draw our inferences
from the unique pair of samples observed, or, more precisely, from the statistics $\bar{x}_1$, $\bar{x}_2$, $s_1^2$
and $s_2^2$ which they provide. These statistics are our only data for the purpose of making
inferences, but we add something to these data in the interpretation when we regard the
samples as being drawn randomly from hypothetical normal populations. Once having
embarked on this method of interpretation, we should stick to it consistently throughout.
The sampling variations of $s_i^2$ should be taken into account only by a direct use of the
probability distributions as given by our equation (1) and not by any inversion such as is
involved in Fisher’s conception of the fiducial distribution of $\sigma_i^2$. As we have seen, it is
quite possible to make probability statements about the difference between the population
means without making any reference whatever either to inverse probability or to fiducial
distributions.

The distinction between the procedure which Fisher advocates and one which averages
over the $s_i^2$ distributions has, of course, been stressed by most of the writers who have
contributed papers on the subject, from whatever viewpoint (e.g. Bartlett, 1936, p. 560,
and Yates, 1939). What has been lacking hitherto, however, is a solution, analogous to
Gosset’s single sample solution, which makes complete use of the information contained
in the data provided. Bartlett indicated one particular way in which probability inferences
about the difference between two population means might be made, but was careful to point
out that the problem of making the best possible inferences (in the theoretical sense of
utilizing all the information in the data to its full extent) was still an open one. There has
indeed been some doubt expressed whether a fully satisfactory solution from this point of
view existed at all. I believe, however, that the one I advance above in equation (11),
and develop in equation (21), meets all the requirements that one can reasonably expect.

Whatever conclusion the reader may come to on these matters, however, he will probably
wish to know how, in the numerical details, this solution will differ from that of Behrens.
This will be more easily seen when some tables become available, but fortunately certain
comparisons can already be made. For Fisher (1941, p. 156) has provided a series
expansion of the Behrens solution. In our notation, and with $k = 2$, this may be written, to
order $1/f_i$, as follows:

$$h(s^2) = \xi\sqrt{(\lambda_1 s_1^2 + \lambda_2 s_2^2)}\left[1 + \frac{1+\xi^2}{4}\,\frac{\lambda_1^2 s_1^4/f_1 + \lambda_2^2 s_2^4/f_2}{(\lambda_1 s_1^2 + \lambda_2 s_2^2)^2} + \left(\frac{1}{f_1} + \frac{1}{f_2}\right)\frac{\lambda_1\lambda_2 s_1^2 s_2^2}{(\lambda_1 s_1^2 + \lambda_2 s_2^2)^2}\right]. \tag{32}$$

Even to this order, this differs from our equation (21) in the inclusion of an extra term. In
other words, although the two solutions are the same when samples are large enough to
adopt the large-sample normal approximation, they differ immediately we take into account
the first corrective term, i.e. they differ as soon as we begin to attach any importance to
‘Studentization’.


6. An interval estimate for $\eta$. We have shown in §§ 2 and 3 how to calculate a value
$h(s_1^2, s_2^2, \ldots, s_k^2, P)$, depending on the observed variances $s_1^2, s_2^2, \ldots, s_k^2$, such that the probability
is $P$ that $(y - \eta) < h(s_1^2, s_2^2, \ldots, s_k^2, P)$. This provides a method of testing the consistency of an
observed $y$ with a prescribed value $\eta$.

When the question is not whether any particular given $\eta$ is contradicted by the data, but
rather one of estimating $\eta$ and at the same time of providing a measure of the uncertainty
of the estimate, the further step required is immediate. For, as in the case of a single sample,
the order of the words in our probability statement can be changed so that it becomes: the
probability is $P$ that $\eta$ is greater than $\{y - h(s_1^2, s_2^2, \ldots, s_k^2, P)\}$. An interval estimate for $\eta$ is then
obtained by taking two levels $P_1$ and $P_2$ for $P$. Thus the probability is $(P_1 - P_2)$ that $\eta$ lies
between $\{y - h(s_1^2, s_2^2, \ldots, s_k^2, P_1)\}$ and $\{y - h(s_1^2, s_2^2, \ldots, s_k^2, P_2)\}$.

If $P_2 = (1 - P_1)$ the range will be symmetrically placed about $y$. Thus, for example, if
$P_1 = 0.95$ and $P_2 = 0.05$, the chance will be 90 % that $\eta$ lies within the range

$$y \pm 1.6449\sqrt{(\textstyle\sum\lambda_i s_i^2)}\left[1 + \frac{1 + (1.6449)^2}{4}\,\frac{\sum(\lambda_i^2 s_i^4/f_i)}{(\sum\lambda_i s_i^2)^2} + \text{etc.}\right]. \tag{33}$$

It may be noted, incidentally, that this range is always narrower than similar ranges
calculated from Behrens’s solution.
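As an illustration of (33), the sketch below evaluates the symmetrical range keeping only the first corrective term, so it is a rough approximation rather than the tabulated solution the author promises. The names `interval_first_order`, `lam`, `s2` and `f` are hypothetical, and the default deviate 1.6449 corresponds to the 90 % range of the text.

```python
from math import sqrt


def interval_first_order(y, lam, s2, f, xi=1.6449):
    """Range y -/+ h, with h taken from (33) truncated after the first
    corrective term (the 'etc.' terms are ignored)."""
    terms = [l * s for l, s in zip(lam, s2)]
    A = sum(terms)
    B = sum(t ** 2 / fi for t, fi in zip(terms, f))
    h = xi * sqrt(A) * (1 + (1 + xi ** 2) / 4 * B / A ** 2)
    return y - h, y + h


low, high = interval_first_order(10.0, [0.1, 0.05], [4.0, 9.0], [9, 19])
print(low, high)
```

The corrective factor is always greater than unity, so the range is wider than the large-sample normal range $y \pm 1.6449\surd(\sum\lambda_i s_i^2)$.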


REFERENCES 

Bartlett, M. S. (1936). The information available in small samples. Proc. Camb. Phil. Soc. 32, 560-6.

Behrens, W. U. (1929). Ein Beitrag zur Fehlerberechnung bei wenigen Beobachtungen. Landw. Jb.
68, 807-37.

Fisher, R. A. (1941). The asymptotic approach to Behrens’s integral, with further tables for the d test
of significance. Ann. Eugen., Lond., 11, 141-72.

Jeffreys, H. (1940). Note on the Behrens-Fisher formula. Ann. Eugen., Lond., 10, 48-51.

‘Student’ (1908). The probable error of a mean. Biometrika, 6, 1-25.

Welch, B. L. (1936). Specification of rules for rejecting too variable a product, with particular
reference to an electric lamp problem. J. Roy. Statist. Soc. Suppl. 3, 29-48.

Welch, B. L. (1938). The significance of the difference between two means when the population
variances are unequal. Biometrika, 29, 350-62.

Yates, F. (1939). An apparent inconsistency arising from tests of significance based on fiducial
distributions of unknown parameters. Proc. Camb. Phil. Soc. 35, 579-91.




THE DISTRIBUTION OF KENDALL’S τ COEFFICIENT OF RANK
CORRELATION IN RANKINGS CONTAINING TIES

By G. P. SILLITTO, Ph.D., B.Sc., Research Department, I.C.I. (Explosives) Ltd.

A new coefficient of rank correlation has been described by Kendall (1938, 1942, 1943) and
denoted by him as $\tau$. This coefficient has advantages over Spearman’s $\rho$ in respect of the
smoothness of its distribution and the rapidity with which it approaches normality, thus
facilitating significance testing, and in being readily adapted to cases of partial rank
correlation.

The distribution of $\tau$ has been worked out by Kendall (1938, 1943) for cases in which
neither ranking contains members which are graded equal, i.e. rankings containing no
‘ties’. It is the purpose of the present paper to deal with cases, which frequently arise in
practice, in which ties occur in one of the two rankings. The method is a generalization of
that of Kendall and will be given in some detail for the case of tied pairs, while the results
of further generalization to multiplet ties will be indicated without detailed proof, which can
in all cases be effected simply on the lines indicated.

Definition of τ for rankings containing ties

In counting the ‘score’ of a pair of rankings, by the methods suggested by Kendall, each
member is compared with the other members of the same ranking, and additions to or
subtractions from the score are made depending on whether it is smaller or greater in each case.
If some members are ranked equal then it is proposed that no change be made in the score
in comparing them. This obviously accords with the intuitive aspects of ranking. Thus in
the pair of rankings following, the score is +8:

1 2 3 4 5 6

2 1 3 5 6 3

The maximum score possible is thus obviously reduced by the presence of ties, and it is
evident that the presence of each tied pair reduces the maximum possible score by unity,
so that it becomes $\tfrac12 n(n-1) - p_2$ for the case of a ranking of $n$ members containing $p_2$ pairs.
Thus for such a ranking $\tau$ would be defined as

$$\tau = 2S/\{n(n-1) - 2p_2\},$$

where $S$ is the observed score.

Generally, each $r$-tuplet tie reduces the maximum possible score by $\tfrac12 r(r-1)$,* so that for
a ranking of $n$ members containing $p_2$ pairs, $p_3$ triplets, ..., $p_r$ $r$-tuplets,

$$\tau = \frac{2S}{n(n-1) - 2p_2 - 6p_3 - \ldots - r(r-1)p_r}.$$
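The counting rule and the definition of $\tau$ can be sketched in a few lines. The code below is illustrative, not the author's procedure for hand computation: it scores every pair of members as $+1$, $-1$ or $0$ (a tie in either ranking contributing nothing), and it assumes, as in the paper, that ties occur in the second ranking only. The function names are hypothetical.

```python
from collections import Counter


def sign(x):
    return (x > 0) - (x < 0)


def score(a, b):
    """Kendall's score S for two rankings of the same members;
    a tied comparison contributes 0."""
    n = len(a)
    return sum(sign(a[j] - a[i]) * sign(b[j] - b[i])
               for i in range(n) for j in range(i + 1, n))


def tau_with_ties(a, b):
    """tau = 2S / {n(n-1) - sum over tied groups in b of r(r-1)}."""
    n = len(a)
    reduction = sum(r * (r - 1) for r in Counter(b).values())
    return 2 * score(a, b) / (n * (n - 1) - reduction)


a = [1, 2, 3, 4, 5, 6]
b = [2, 1, 3, 5, 6, 3]
print(score(a, b), tau_with_ties(a, b))   # score 8, tau = 8/14
```

Untied members form groups of size $r = 1$ and so contribute nothing to the reduction of the denominator.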

* This result has been given by Kendall (1945).

The sum of the frequencies of the possible scores

When no ties are present, each permutation of the $n$ members produces a possible score so
that there are in all $n!$ possible scores. When ties are present they decrease the number of
possible permutations of an assigned set of members, but, on the other hand, they give rise
to further families of scores due to the different places in the ranking which can be occupied
by the tied members. Thus, for instance, the rankings

112456, 122456, 123356, 123446, 123455

all give rise to the maximum score, 14.

Considering any assigned ranking, the number of possible permutations with $p_2$ pairs
present is $n!/2^{p_2}$, or if there are in addition $p_3$ triplets, ..., $p_r$ $r$-tuplets, $n!/\{(2!)^{p_2}(3!)^{p_3}\ldots(r!)^{p_r}\}$.

The distribution of scores of an assigned set of ranks will be referred to as the basic
distribution for the type of ranking concerned, since consideration of the possible ways of
assigning the pairs, triplets, etc., among the members of the ranking has only the effect
of multiplying the frequency of each score by a constant factor. This factor is the number of
ways of distributing the $p_1 + p_2 + p_3 + \ldots + p_r$ ranks among the $n$ members. This is the
number of possible permutations

$$\frac{(p_1 + p_2 + p_3 + \ldots + p_r)!}{p_1!\,p_2!\,p_3!\ldots p_r!}.$$

Basic frequency distributions of the scores

The basic frequency distributions can be established by an extension of the methods given
by Kendall. Considering first the case of tied pairs the frequency function of the basic
distribution of the scores may be written $f(S, n, p_2)$, where $p_2$ is the number of pairs. The
frequency generating function is then $\sum_j f(S_j, n, p_2)\,t^{S_j}$. Now consider the addition of another
tied pair, with a greater ranking than any of the existing ranks. If it is added to the extreme
left of the ranking it adds $-2n$ to the score. Moving one of the pair one place to the right
adds 2 to this new score; bringing the other added member up to it adds another 2. Starting
again with both the new members on the extreme left, movement of one of them two places
to the right adds 4 to the new score; bringing the other up to it in two steps, each of one place,
adds successively a further 2 and 4. Proceeding in this way all possible additions to the old
score which may be brought about by the addition of a tied pair of new members are
represented by the array

$$\begin{array}{cccccc}
-2n & -(2n-2) & -(2n-4) & -(2n-6) & \ldots & 0 \\
 & -(2n-4) & -(2n-6) & -(2n-8) & \ldots & +2 \\
 & & -(2n-8) & -(2n-10) & \ldots & +4 \\
 & & & -(2n-12) & \ldots & \vdots \\
 & & & & & +2n
\end{array}$$

Thus the addition of a new tied pair has the effect of multiplying the frequency generating
function by

$$(t^{-2n} + t^{-(2n-2)} + \ldots + t^{0}) + (t^{-(2n-4)} + \ldots + t^{2}) + (t^{-(2n-8)} + \ldots + t^{4}) + \ldots + t^{2n}.$$

The addition of a single new member to the ranking has the effect of multiplying the
frequency generating function by

$$(t^{-n} + t^{-(n-2)} + t^{-(n-4)} + \ldots + t^{n}),$$

as shown by Kendall, the presence of tied pairs in the existing ranking having no effect.
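The two multiplications of the generating function can be sketched as operations on a table of frequencies. In the illustrative code below a distribution is a `Counter` mapping score to frequency; a new top-ranked member with $k$ existing members to its left adds $2k - n$ to the score, and a new top-ranked tied pair with $k_1 \le k_2$ members to the left of its two members adds $2(k_1 + k_2) - 2n$, which is the reading of the array assumed here. The brute-force enumeration is included only as a check, and all names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def add_single(dist, n):
    """Multiply by t^-n + t^-(n-2) + ... + t^n (n = members already present)."""
    out = Counter()
    for s, c in dist.items():
        for k in range(n + 1):
            out[s + 2 * k - n] += c
    return out


def add_tied_pair(dist, n):
    """Multiply by the sum of t^(2(k1+k2)-2n) over 0 <= k1 <= k2 <= n."""
    out = Counter()
    for s, c in dist.items():
        for k1 in range(n + 1):
            for k2 in range(k1, n + 1):
                out[s + 2 * (k1 + k2) - 2 * n] += c
    return out


def brute_force(ranks):
    """Score distribution over the distinct arrangements of `ranks`,
    each scored against the natural order 1, 2, 3, ..."""
    out = Counter()
    m = len(ranks)
    for arr in set(permutations(ranks)):
        s = sum((arr[j] > arr[i]) - (arr[j] < arr[i])
                for i in range(m) for j in range(i + 1, m))
        out[s] += 1
    return out


one_member = Counter({0: 1})
two_members = add_single(one_member, 1)
with_pair = add_tied_pair(two_members, 2)      # ranks 1, 2, 3, 3
print(sorted(with_pair.items()))
```

For the ranks 1, 2, 3, 3 the twelve distinct arrangements give scores from $-5$ to $+5$, the maximum agreeing with $\tfrac12 n(n-1) - p_2 = 5$.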




With these two recurrence relations there is no difficulty in drawing up a table of basic 
frequency distributions for tied pairs as exemplified in Table 1, in which only positive values 
of the score are shown, negative values being obtainable by symmetry. 


Table 1. DMribidion of the score 8 for values of n from ‘A to 1, and for rankings containing 
Pj pairs of members ranked equal (only positive half of symmetrical distribution) 



Before the construction of the table has proceeded far, however, it becomes evident that
there is a recurrence relation between individual frequencies for any given value of $n$, such
that the frequency of any score $S_j$ for $p_2$ pairs is the sum of the frequencies of $S_j - 1$ and
$S_j + 1$ for $p_2 + 1$ pairs. This obviously arises from the fact that if two members ranked equal,
say $r$th, in a ranking with $p_2 + 1$ pairs are subsequently distinguished and given rankings $r$
and $r + 1$, this will increase the score by unity if the $(r+1)$th member falls after the $r$th member
when the ranking is arrayed against another ranking in the natural order 1, 2, 3, ..., and
reduce it by unity if the other member of the pair becomes the $(r+1)$th; and these two
possibilities complete the ways of forming a ranking with $p_2$ pairs from one with $p_2 + 1$ pairs.

This simple relationship, which may be written

$$f(S_j, n, p_2) = f(S_j + 1, n, p_2 + 1) + f(S_j - 1, n, p_2 + 1),$$

or, taking another way of writing the basic distribution function,

$$\phi(S_j, p_1, p_2) = \phi(S_j + 1, p_1 - 2, p_2 + 1) + \phi(S_j - 1, p_1 - 2, p_2 + 1), \tag{1}$$

$p_1$ being the number of members not in tied pairs, is of great assistance in tabulating the
frequency distribution, and will be used below to establish the formula for the variance of $S$.
It can be generalized to cover the effect of increasing the number of $r$-tuplets, when it becomes

$$\begin{aligned}
\phi(S_j, p_1, p_2, \ldots, p_{r-1}, p_r) ={}& \phi(S_j + r - 1,\; p_1 - 1, p_2, \ldots, p_{r-1} - 1, p_r + 1) \\
&+ \phi(S_j + r - 3,\; p_1 - 1, p_2, \ldots, p_{r-1} - 1, p_r + 1) \\
&+ \ldots \\
&+ \phi(S_j - r + 1,\; p_1 - 1, p_2, \ldots, p_{r-1} - 1, p_r + 1).
\end{aligned} \tag{2}$$
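The recurrence for tied pairs is easily verified by enumeration for small $n$. The sketch below is a brute-force check, not a method of tabulation: it builds the basic distribution of a given set of ranks by scoring every distinct arrangement against the natural order, and then tests whether each frequency for $p_2$ pairs equals the sum of the frequencies at $S + 1$ and $S - 1$ for $p_2 + 1$ pairs. The function names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def basic_distribution(ranks):
    """Frequencies of the score over the distinct arrangements of `ranks`."""
    dist = Counter()
    m = len(ranks)
    for arr in set(permutations(ranks)):
        s = sum((arr[j] > arr[i]) - (arr[j] < arr[i])
                for i in range(m) for j in range(i + 1, m))
        dist[s] += 1
    return dist


def recurrence_holds(fewer_pairs, more_pairs):
    """Check f(S; p2) = f(S+1; p2+1) + f(S-1; p2+1) for every score S."""
    f0 = basic_distribution(fewer_pairs)
    f1 = basic_distribution(more_pairs)
    return all(f0[s] == f1[s + 1] + f1[s - 1]
               for s in range(min(f0) - 1, max(f0) + 2))


print(recurrence_holds((1, 2, 3, 4), (1, 2, 3, 3)))        # no pairs vs one pair
print(recurrence_holds((1, 2, 3, 3, 5), (1, 1, 3, 3, 5)))  # one pair vs two pairs
```

A missing score is read from a `Counter` as frequency zero, so the check also covers the ends of the range.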




































Frequency and probability distributions of the score S

From a table of basic frequency distributions such as Table 1, a table showing the probability
of attaining or exceeding an observed value of $S$ by chance from an uncorrelated pair of
rankings can obviously be constructed, and Table 2 shows such probabilities
(positive only, negative values obtainable by symmetry) for values of $n$ up
to 10, and all possible numbers $p_2$ of paired and $p_3$ of triplet ties.


The variance of S

The variance of $S$ when ties are present can be readily derived by using the recurrence
relations given above and the value given by Kendall for the case of no ties. For the case
of tied pairs consider

$$\begin{aligned}
(S+1)^2\phi(S+1, p_1-2, p_2+1) &+ (S-1)^2\phi(S-1, p_1-2, p_2+1) \\
={}& S^2\{\phi(S+1, p_1-2, p_2+1) + \phi(S-1, p_1-2, p_2+1)\} \\
&+ 2\{(S+1)\phi(S+1, p_1-2, p_2+1) - (S-1)\phi(S-1, p_1-2, p_2+1)\} \\
&- \{\phi(S+1, p_1-2, p_2+1) + \phi(S-1, p_1-2, p_2+1)\}.
\end{aligned}$$

If now both sides of this equation are summed over all values of $S$, the terms on the
left-hand side become

$$\frac{n!}{2^{p_2}}\,\operatorname{var}\phi(S, p_1-2, p_2+1).$$

The first terms on the right-hand side become, by virtue of the recurrence relation (1),

$$\frac{n!}{2^{p_2}}\,\operatorname{var}\phi(S, p_1, p_2),$$

the second vanishes through the symmetry of the distribution, while the third becomes

$$-\,2\,\frac{n!}{2^{p_2+1}}.$$

Hence there is obtained

$$\operatorname{var}\phi(S, p_1-2, p_2+1) = \operatorname{var}\phi(S, p_1, p_2) - 1,$$

and so

$$\operatorname{var}\phi(S, n-2p_2, p_2) = \operatorname{var}\phi(S, n) - p_2 = \frac{n(n-1)(2n+5)}{18} - p_2,$$

using Kendall’s result.

These results can also be generalized, using equation (2), to deal with multiplet ties,
obtaining

$$\operatorname{var}\phi(S, p_1-1, \ldots, p_{r-1}-1, p_r+1) = \operatorname{var}\phi(S, p_1, \ldots, p_{r-1}, p_r) - (r^2-1)/3,$$

and

$$\operatorname{var}\phi(S, p_1, p_2, \ldots, p_r) = \frac{n(n-1)(2n+5)}{18} - p_2 - \frac{3+8}{3}\,p_3 - \ldots - \frac{3+8+\ldots+(r^2-1)}{3}\,p_r.$$

It is obvious from these equations for multiplet ties that for any given number of ties of
each multiplicity the variance will tend towards that of the system without any ties as $n$
increases.
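The variance formula can be checked against direct enumeration for small rankings. The sketch below uses exact fractions; `variance_formula` takes $n$ and a list of the sizes of the tied groups (for instance `[2, 3]` for one pair and one triplet), and `variance_by_enumeration` averages $S^2$ over the distinct arrangements, the mean score being zero by symmetry. Both names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def variance_formula(n, tie_sizes):
    """n(n-1)(2n+5)/18 less {3 + 8 + ... + (r^2 - 1)}/3 for each r-tuplet."""
    var = Fraction(n * (n - 1) * (2 * n + 5), 18)
    for r in tie_sizes:
        var -= Fraction(sum(j * j - 1 for j in range(2, r + 1)), 3)
    return var


def variance_by_enumeration(ranks):
    """Mean of S^2 over the distinct arrangements of `ranks`."""
    arrangements = set(permutations(ranks))
    m = len(ranks)
    total = 0
    for arr in arrangements:
        s = sum((arr[j] > arr[i]) - (arr[j] < arr[i])
                for i in range(m) for j in range(i + 1, m))
        total += s * s
    return Fraction(total, len(arrangements))


print(variance_formula(4, [2]), variance_by_enumeration((1, 2, 3, 3)))
print(variance_formula(5, [2, 3]), variance_by_enumeration((1, 1, 3, 3, 3)))
```

The deduction for an $r$-tuplet, $\{3 + 8 + \ldots + (r^2-1)\}/3$, is equal to $r(r-1)(2r+5)/18$, which may be more convenient for large multiplets.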


Application to a practical case

The following results were obtained in a practical case in which two different tests were
carried out on one each of a set of products. The problem is to determine the degree of
relationship between the results of the two tests. It is also an instance of an occurrence
which arises at times in practice in which some of the results are ‘off the scale’ of
measurement with respect to one of the tests; these, twelve in all, have been given a tied ranking of 18.

T.'st A 

4()'H(I 

41*7() 

3 O' 75 

,1 i *a5 

2!)-40 


25-*2() 

211*7,5 

28*45 

20*,S5 

2(i*35 

Test B 

1*5 

1'5 

I'5 

2*5 

3-5 


10 

2*5 

2*5 

2*5 

(1*5 

Teat A 

21'4() 

iti-or) 

1H‘!)5 

22*1)0 

22*«(l 


20'25 

24*45 

22*70 

20*50 


Test B 

>1(1 

>10 

>10 

> 10 

>10 


>10 

> 10 

>10 

> 10 


'Past A 

22'()0 

27-, 50 

23'7ri 

30-S0 

21*00 


27*10 

22*10 

li)*25 

25*45 

24*10 

'Past B 

>10 

:p5 

I*') 

2*5 

7 


0*5 

7*5 

0 

3*5 

>10 

Ranking the results according to their order in test A (from highest values to lowest) there
is obtained

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
 1  1  5  1  5 10  5 10 13  5  5 18 13 10 18

16 17 18 19 20 21 22 23 24 25 26 27 28 29
18 18  1 18 18 18 16 18 18 15 18 18 17 18

The lower ranking has 1 pair, 1 triplet, 1 quadruplet, 1 quintuplet and one 12-member
multiplet. The maximum possible score is

$$\frac{29 \times 28}{2} - 1 - 3 - 6 - 10 - 66 = 320$$

in such a ranking. The actual score is +212, giving $\tau = \dfrac{212}{320} = +0.6625$. The variance of the distribution
of the scores obtained with such rankings in the case of no correlation is

$$\frac{29 \cdot 28 \cdot 63}{18} - 1 - \frac{11}{3} - \frac{26}{3} - \frac{50}{3} - \frac{638}{3} = 2599.33.$$

Hence the probability of obtaining a score of 212 or more from an uncorrelated pair of
such rankings corresponds to the probability of a normal variate attaining or exceeding

$$\frac{212}{\sqrt{2599.33}} = 4.158$$

times its standard deviation.
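The arithmetic of this example can be reproduced as follows. The list `test_b_ranks` is a transcription of the lower ranking as read here from the printed table (it should be checked against the original before being relied on), and the function names are hypothetical. The deduction $r(r-1)(2r+5)/18$ used for each tied group is the closed form of $\{3 + 8 + \ldots + (r^2-1)\}/3$.

```python
from collections import Counter
from math import sqrt

# Test B ranks, listed in the order of the test A ranks 1, 2, ..., 29.
test_b_ranks = [1, 1, 5, 1, 5, 10, 5, 10, 13, 5, 5, 18, 13, 10, 18,
                18, 18, 1, 18, 18, 18, 16, 18, 18, 15, 18, 18, 17, 18]


def score_against_natural_order(b):
    """Score S: +1 for each later member ranked higher, -1 for each lower."""
    m = len(b)
    return sum((b[j] > b[i]) - (b[j] < b[i])
               for i in range(m) for j in range(i + 1, m))


def summarize(b):
    n = len(b)
    s = score_against_natural_order(b)
    ties = [r for r in Counter(b).values() if r > 1]
    max_score = n * (n - 1) // 2 - sum(r * (r - 1) // 2 for r in ties)
    variance = (n * (n - 1) * (2 * n + 5) / 18
                - sum(r * (r - 1) * (2 * r + 5) / 18 for r in ties))
    return s, max_score, s / max_score, variance, s / sqrt(variance)


s, max_score, tau, variance, ratio = summarize(test_b_ranks)
print(s, max_score, tau, round(variance, 2), round(ratio, 3))
```

The normal approximation is being used here without a continuity correction, as in the text.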


REFERENCES

Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30, 81.

Kendall, M. G. (1942). Partial rank correlation. Biometrika, 32, 277.

Kendall, M. G. (1943). The Advanced Theory of Statistics, 1, chapter 16, Rank Correlation. London:
C. Griffin and Co. Ltd.

Kendall, M. G. (1945). The treatment of ties in ranking problems. Biometrika, 33.






THE USE OF RANGE IN PLACE OF STANDARD
DEVIATION IN THE t-TEST

By E. LORD, B.Sc., Shirley Institute, Didsbury, Manchester

CONTENTS

(i) Introduction

(ii) The t-test

(iii) The modified test (u-test) based on range

(iv) Computation of percentage points of the distribution of u = u(1, n), i.e. case with m = 1

(v) Computation of percentage points of the distribution of u = u(2, n), i.e. case with m = 2

(vi) Computation of percentage points of the distribution of u = u(m, n) for m > 2

(vii) Approximate values of the percentage points of u

(viii) Applications of the u-test

Appendix. On the independence of mean and some linear estimates of standard deviation
in random samples from a normal population

(i) Introduction

The difference between highest and lowest values has always been recognized as a general
indication of the variability of quantitative data. It was not, however, until 1925 that
attention began to be focused upon the range as a useful statistical tool. In his paper ‘On
the extreme individuals and the range in samples taken from a normal population’, Tippett
(1925) obtained an expression for the mean value of the range in repeated random samples,
and computed its value in terms of the population standard deviation for samples of size
n = 2 to n = 1000. He also gave numerical approximations to the values of the moments
of the range for fairly large samples.

The work was taken up by E. S. Pearson (1926), who determined numerically the exact
values of the moments of the random sampling distribution of the range for small samples
of size n < 6, and also approximations to their values for samples of medium size. In a
subsequent paper E. S. Pearson (1932) tabulated the upper and lower percentage limits for
the distribution of the range from frequency curves fitted with the values of the moment
coefficients taken from both of the earlier papers cited above.

The next advance was the determination of a general expression for the distribution of
the range in samples of n random values from any population by McKay & Pearson (1933).
For the normal population, only in the case of n = 3 was it found possible to obtain a fairly
simple analytical form. (The distribution for n = 2 is, of course, well known, taking the
form of the positive half of a normal curve.)

Hartley (1942) later determined an expression for the probability integral of the range and,
with Pearson (1942), tabulated this for the normal population for samples between n = 2 and
n = 20. This latter paper also contains a table of several percentage limits of the range
in samples from a normal population. These limits are derived from the numerical values of
the probability integrals and replace the approximate values previously given by Pearson
referred to above.

Tippett (1925) and Pearson (1932) have pointed out that although the total range in a
sample may be used for the purpose of estimating the population value of the standard
deviation, a more efficient measure may be obtained by dividing the sample into random
subgroups of equal size and using the mean range of the subgroups in place of the total range.
The efficiency of range estimates of standard deviation is, of course, always less than that of
root-mean-square estimates, but the work of Davies & Pearson (1934) and Pearson & Haines
(1936) indicates that information is not discarded to any serious extent providing that the
number of observations in the subsamples is not greatly in excess of about 10.

As a result of the work outlined above, the range is now of considerable importance in 
many fields, especially in industrial quality control, where its simplicity has enabled it to 
be extensively and easily applied to the measurement of fluctuations in the variability of 
quality of a manufactured article or material. 

In the present paper an investigation is made of the use of range estimates of standard
deviation in the consideration of the statistical significance of deviations of sample means in
normal random sampling theory. This use of range estimates of standard deviation is
analogous to the use of root-mean-square estimates in the well-known t-test. Tables are
given, at several probability levels, and these may be employed in determining the statistical
significance of either the deviation of a sample mean from some fixed or hypothetical
population value, or the difference between the means of two samples. These tables may also be
used for obtaining rapid estimates of the accuracy of a sample mean from the variation
within the sample as measured by the range. The use of range, in place of root-mean-square
estimates of standard deviation, in this modified form of the t-test necessarily entails some
loss of precision. It will, however, be shown in a future paper that this reduction in accuracy
is negligible for all practical purposes. Furthermore, this slight disadvantage of the new test
is compensated by its greater simplicity, involving a reduced amount of computing compared
with the usual t-test.

The range test is suitable for application to many problems frequently encountered in the 
treatment of various types of experimental data and in considering the mean character value 
in small samples in biological experiments. In the industrial field, the range test may be 
used for detecting changes in mean quality level, especially where the variation is not 
under strict statistical control or is subject to secular changes, or for determining whether 
the average level of a batch determined from a sample is in accordance with specification 
demands. A number of these problems are covered in the examples given at the end of the 
paper. 


(ii) The t-test*

In testing the significance of the deviation of a sample mean $\bar{x}$ from an assumed population
value $\mu$ use is made of the ratio

$$t = \frac{(\bar{x} - \mu)\sqrt{N}}{s}, \tag{1}$$

where $N$ is the size of the sample and $s$ is the root-mean-square estimate of the population
standard deviation determined from the sample. In applying this ratio it is assumed that
the $N$ values form a random sample from a normal population of which the mean is $\mu$,
standard deviation $\sigma$ and the distribution of values of $x$ is given by

$$p(x) = \frac{1}{\sigma\sqrt{(2\pi)}}\exp\left\{-\frac{(x-\mu)^2}{2\sigma^2}\right\}. \tag{2}$$

* ‘Student’ (1908), R. A. Fisher (1925).





More generally $t$ may be defined as the ratio

$$t = x/s, \tag{3}$$

where $x$ and $s$ are statistically independent, $x$ being a quantity distributed normally about
a mean of zero and $s$ a root-mean-square estimate based on $\nu$ degrees of freedom of the
standard error of $x$. Although the use of the tables of the probability integral of $t$ enables
the most efficient tests to be made of the various forms of the so-called ‘Student’s
Hypothesis’, occasions frequently arise when more rapid tests are desirable, especially if
accompanied by only inappreciable loss of accuracy. The calculation of $s$, depending upon the
squaring of numerical quantities, entails a certain amount of labour, especially if tables of
squares or a calculating machine are not available. The use of the range, or the mean range
determined from random subgroups in a sample, enables very rapid estimates to be made
of the population value of the standard deviation $\sigma$. In the following section these range
estimates are used in place of root-mean-square estimates in a modified form of the t-test.


(iii) The modified test (u-test) based on range

Here we replace the s of ‘Student’s’ ratio by an estimate of σ based on the range. Thus

\[ u = u(m,n) = \frac{x}{\bar{w}(m,n)/d_n}, \tag{4} \]

where x is a quantity distributed normally about a mean of zero and w̄(m, n) is the mean
value of m ranges w, obtained from m independent samples or subgroups, each containing
n observations. The constant d_n, in a commonly used notation,* is the expected value of the
range in samples of n, randomly selected from a normal population of unit standard deviation.
The ratio w̄(m, n)/d_n is therefore an estimate of the standard error of x obtained from
range and, as such, replaces the root-mean-square estimate s used in the ratio t = x/s.
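The definition of u can be illustrated by a minimal Python sketch. Everything named here (the function names, the integration grid and the sample figures) is an assumption made for illustration and does not come from the paper. The constant d_n is evaluated numerically from the standard identity d_n = ∫ [1 − Φ(x)^n − (1 − Φ(x))^n] dx, and x is taken, as an assumption, to be (x̄ − ξ)√N with all N = mn values pooled.

```python
import math


def normal_cdf(x):
    """Probability integral of the unit normal curve."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_range_constant(n, step=0.01, limit=9.0):
    """Expected range d_n in samples of n from a unit normal population,
    by the trapezoidal rule applied to 1 - F^n - (1 - F)^n."""
    count = int(round(2.0 * limit / step))
    total = 0.0
    for i in range(count + 1):
        x = -limit + i * step
        f = normal_cdf(x)
        weight = 0.5 if i in (0, count) else 1.0
        total += weight * (1.0 - f ** n - (1.0 - f) ** n)
    return total * step


def u_statistic(subgroups, xi):
    """u = x * d_n / (mean range), with x = (mean - xi) * sqrt(N).
    `subgroups` is a list of m equal-sized lists (hypothetical data layout)."""
    n = len(subgroups[0])
    if any(len(group) != n for group in subgroups):
        raise ValueError("all subgroups must contain the same number of values")
    values = [v for group in subgroups for v in group]
    big_n = len(values)
    x = (sum(values) / big_n - xi) * math.sqrt(big_n)
    mean_range = sum(max(g) - min(g) for g in subgroups) / len(subgroups)
    return x * mean_range_constant(n) / mean_range


if __name__ == "__main__":
    print(mean_range_constant(2), mean_range_constant(3))
    print(u_statistic([[0.0, 1.0], [2.0, 4.0]], xi=1.0))
```

The numerical d_2 and d_3 can be compared with the exact values 2/√π and 3/√π; for larger n a published table of mean ranges would normally be used instead of integration.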

Except for a few special cases, it has not been found possible to determine the analytical
form of the distribution of u, but several tables of percentage points have been computed
for use in testing the various statistical hypotheses normally covered by the t-test. The
computation of these tables is considerably simplified by first determining the percentage
points of the distribution of the subsidiary quantity

\[ q = q(m,n) = \frac{x}{\bar{w}(m,n)} = \frac{u}{d_n}, \tag{5} \]

and then multiplying by the corresponding value of d_n to obtain the percentage points of the
u distribution.

To simplify the algebraic expressions in what follows, u, w̄ and q will be written for u(m, n),
w̄(m, n) and q(m, n) where no confusion is involved.

The distributions of both u and q are clearly independent of σ. Hence, without any loss
of generality, σ may be taken equal to unity in considering the distributions. The distribution
of x will therefore be defined by

\[ p(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}. \tag{6} \]


* See, for example, Pearson (1936), pp. 84 and 90.



Range in place of standard deviation in the t-test


Furthermore, let the distribution of the range in a sample of n be y = p(w), that of w̄
be y = p(w̄), and that of q be y = p(q). Then since x and w̄ are defined to be statistically
independent, we have the distribution of q given by

\[ p(q) = \iint p(\bar{w})\,p(x)\,d\bar{w}\,dx, \tag{7} \]

where the integral is to be evaluated over the field of all values of x and w̄ subject to the
relation (5) and to the conditions:

\[ -\infty < x < \infty, \qquad 0 \le \bar{w} < \infty. \tag{8} \]

Since x is distributed symmetrically about zero, and w̄ is independent of x, the ratio q is
also symmetrically distributed about zero. Let q_α be the value of q such that α is the chance
that |q| ≥ q_α. The quantity α represents the total area of the two equal tails of the
distribution lying outside deviations ±q_α, and we have

\[ (1-\alpha) = 2\int_0^{q_\alpha} p(q)\,dq. \tag{9} \]

Alternatively, from (5), (7) and (8), this may be written in the form

\[ 1-\alpha = \int_0^\infty p(\bar{w}) \left[ \int_{-\bar{w}q_\alpha}^{\bar{w}q_\alpha} p(x)\,dx \right] d\bar{w}
 = \frac{2}{\sqrt{2\pi}} \int_0^\infty p(\bar{w}) \left[ \int_0^{\bar{w}q_\alpha} e^{-\frac{1}{2}x^2}\,dx \right] d\bar{w}. \tag{10} \]

Except for a few cases in which the analytical form of the distribution of u has been
obtained, equation (10) has been used to compute values of q_α and, hence, the percentage
points of u for values of α = 0.10, 0.05, 0.02, 0.01, 0.002 and 0.001, with values of n from
2 to 20 and values of m from 1 to 10, 16, 20, 30, 60 and 120.

The percentage points of u = u(m, n) are first considered for the case when the estimate of
standard deviation is based on the value of a single range of n random values (i.e. for m = 1).
This treatment is followed by the case of m = 2, and finally consideration is given to the
general case using estimates of standard deviation determined from the mean of m ranges
each from an independent subgroup of n random values.


(iv) Computation of percentage points of the distribution of
u = u(1, n), i.e. case with m = 1

Throughout this section the estimates of standard deviation are all based upon the value of
the range in a single set of n random values of the variate (thus w̄ = w). In the case of n = 2
and n = 3 analytical solutions are derived for the distributions from which the percentage
points of q and u are calculated. For n ≥ 4, percentage points in the neighbourhood of those
desired are determined by quadrature methods, and the required points obtained from these
by interpolation.

Special case n = 2, m = 1

The distribution of ranges (w) in samples of two random values from a normal population
with unit standard deviation is the distribution of absolute differences between random pairs
of variate values, and this may be easily shown to be

\[ p(w)\,dw = \frac{1}{\sqrt{\pi}}\, e^{-\frac{1}{4}w^2}\,dw \qquad (0 \le w < \infty). \tag{11} \]




Since x and w are independent, it follows from (6) and (11) that their joint probability
distribution is

\[ p(w,x)\,dw\,dx = \frac{1}{\pi\sqrt{2}}\, e^{-\frac{1}{2}(x^2+\frac{1}{2}w^2)}\,dw\,dx. \tag{12} \]

Transforming to new variables, q = x/w and w, and noting that the Jacobian of the
transformation is

\[ \frac{\partial(x,w)}{\partial(q,w)} = w, \]

the joint distribution of q and w is given by

\[ p(q,w)\,dq\,dw = \frac{1}{\pi\sqrt{2}}\, e^{-\frac{1}{2}w^2(q^2+\frac{1}{2})}\,w\,dq\,dw. \tag{13} \]

To obtain the distribution of q it is necessary to integrate (13) over the whole field of w,
from 0 to ∞. This gives

\[ p(q) = \frac{1}{\pi\sqrt{2}\,(q^2+\frac{1}{2})} = \frac{\sqrt{2}}{\pi(2q^2+1)}. \tag{14} \]

Hence from (9) and (14) above, the percentage points of q are given by

\[ (1-\alpha) = \frac{2\sqrt{2}}{\pi}\int_0^{q_\alpha} \frac{dq}{2q^2+1} = \frac{2}{\pi}\tan^{-1}(\sqrt{2}\,q_\alpha), \]

and hence

\[ q_\alpha(1,2) = \frac{1}{\sqrt{2}}\tan\left\{\frac{\pi}{2}(1-\alpha)\right\} = \frac{1}{\sqrt{2}}\cot\frac{\pi\alpha}{2}. \tag{15} \]

The values of q_α determined from (15), for the six values of α under consideration, are
multiplied by d_2 = 2/√π to give the required percentage points of the distribution of
u = u(1, 2).
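The closed form for n = 2, m = 1 is easy to evaluate directly. The short Python sketch below does so; the function names are hypothetical and the code is only an illustration of the formula q_α = (1/√2) cot(πα/2) followed by multiplication by d_2 = 2/√π.

```python
import math


def q_alpha_1_2(alpha):
    """Percentage point of q for a single range of two values:
    q = cot(pi * alpha / 2) / sqrt(2)."""
    return 1.0 / (math.sqrt(2.0) * math.tan(0.5 * math.pi * alpha))


def u_alpha_1_2(alpha):
    """Percentage point of u(1, 2), i.e. q multiplied by d_2 = 2 / sqrt(pi)."""
    return q_alpha_1_2(alpha) * 2.0 / math.sqrt(math.pi)


if __name__ == "__main__":
    for alpha in (0.10, 0.05, 0.02, 0.01, 0.002, 0.001):
        print(alpha, round(u_alpha_1_2(alpha), 4))
```

The values obtained in this way can be set against the first row of Table 1.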

Special case n = 3, m = 1

For random samples of size n = 3 from a normal distribution with unit standard deviation,
the distribution of the range has been found by McKay & Pearson (1933) and takes the form

\[ p(w) = \frac{6}{\pi\sqrt{2}}\, e^{-\frac{1}{4}w^2} \int_0^{w/\sqrt{6}} e^{-\frac{1}{2}z^2}\,dz. \]

Again, since w and x are independent, their joint distribution is given by

\[ p(w,x)\,dw\,dx = \frac{3}{\pi\sqrt{\pi}}\, e^{-\frac{1}{2}(x^2+\frac{1}{2}w^2)}\,dw\,dx \int_0^{w/\sqrt{6}} e^{-\frac{1}{2}z^2}\,dz. \tag{16} \]

Transforming to new variables q = x/w and w, it follows from (16), since the Jacobian of
the transformation is equal to w, that the joint distribution of q and w is given by

\[ p(q,w)\,dq\,dw = \frac{3}{\pi\sqrt{\pi}}\, e^{-\frac{1}{2}w^2(q^2+\frac{1}{2})}\,w\,dw\,dq \int_0^{w/\sqrt{6}} e^{-\frac{1}{2}z^2}\,dz. \tag{17} \]




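To obtain the distribution of q the expression in (17) has to be integrated over the whole
field of w, from 0 to ∞. Thus

\[ p(q) = \frac{3}{\pi\sqrt{\pi}} \int_0^\infty w\,e^{-\frac{1}{2}w^2(q^2+\frac{1}{2})} \left[ \int_0^{w/\sqrt{6}} e^{-\frac{1}{2}z^2}\,dz \right] dw. \]

Putting t = z√6, the above may be written

\[ p(q) = \frac{3}{\pi\sqrt{6\pi}} \int_0^\infty w\,e^{-\frac{1}{2}w^2(q^2+\frac{1}{2})} \left[ \int_0^{w} e^{-\frac{1}{12}t^2}\,dt \right] dw \]

\[ \qquad = \frac{3}{\pi\sqrt{6\pi}} \left[ -\frac{e^{-\frac{1}{2}w^2(q^2+\frac{1}{2})}}{q^2+\frac{1}{2}} \int_0^w e^{-\frac{1}{12}t^2}\,dt \right]_{w=0}^{w=\infty}
 + \frac{3}{\pi\sqrt{6\pi}\,(q^2+\frac{1}{2})} \int_0^\infty e^{-\frac{1}{12}w^2}\, e^{-\frac{1}{2}w^2(q^2+\frac{1}{2})}\,dw. \tag{18} \]

The first expression in (18) is clearly zero and hence

\[ p(q) = \frac{3}{\pi\sqrt{6\pi}\,(q^2+\frac{1}{2})} \int_0^\infty e^{-\frac{1}{2}w^2(q^2+\frac{2}{3})}\,dw
 = \frac{\sqrt{3}}{2\pi\,(q^2+\frac{1}{2})(q^2+\frac{2}{3})^{\frac{1}{2}}}. \]

As before

\[ (1-\alpha) = \frac{6}{\pi} \int_0^{q_\alpha} \frac{dq}{(1+2q^2)(2+3q^2)^{\frac{1}{2}}}, \tag{19} \]

and therefore

\[ \tan\left\{\frac{\pi}{6}(1-\alpha)\right\} = \frac{q_\alpha}{(2+3q_\alpha^2)^{\frac{1}{2}}}. \]

If tan{(π/6)(1 − α)} be denoted by τ, then

\[ q_\alpha(1,3) = \frac{\sqrt{2}\,\tau}{(1-3\tau^2)^{\frac{1}{2}}}. \tag{20} \]

The six required values of q_α are found by substitution of the corresponding values of α in
(20) above, and further multiplication by d_3 = 3/√π gives the percentage limits of the
distribution of u = u(1, 3).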
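As with n = 2, the result for n = 3 can be checked by direct substitution. The following Python sketch (function names are assumptions, not the author's) evaluates τ = tan{(π/6)(1 − α)}, then q_α = √2 τ/(1 − 3τ²)^½, and finally multiplies by d_3 = 3/√π.

```python
import math


def q_alpha_1_3(alpha):
    """Percentage point of q for a single range of three values."""
    tau = math.tan(math.pi * (1.0 - alpha) / 6.0)
    return math.sqrt(2.0) * tau / math.sqrt(1.0 - 3.0 * tau * tau)


def u_alpha_1_3(alpha):
    """Percentage point of u(1, 3), i.e. q multiplied by d_3 = 3 / sqrt(pi)."""
    return q_alpha_1_3(alpha) * 3.0 / math.sqrt(math.pi)


if __name__ == "__main__":
    for alpha in (0.10, 0.05, 0.02, 0.01, 0.002, 0.001):
        print(alpha, round(u_alpha_1_3(alpha), 4))
```

Since (1 − α) < 1, τ is always below tan 30° = 1/√3, so the quantity under the square root stays positive for every α between 0 and 1.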


General case n ≥ 4, m = 1

For n ≥ 4 no suitable algebraic expression exists for the distribution of the range, but
Pearson & Hartley (1942) have tabulated values of the probability integral ∫_0^w p(w) dw to
4 figures at intervals of 0.05 of w for values of n from 2 to 20. Hartley kindly lent manuscript
tables of the integral tabulated to 5 figures at intervals of 0.25 of w for values of n from
2 to 16. Using these five-figure tables for n = 4, 6, 10, 16 and the four-figure tables for n = 20,
the frequency distribution of w was obtained numerically by subtraction of successive values
of ∫_0^w p(w) dw at intervals of 0.25 and then converting these class frequencies into ordinates
y(w). The degree of approximation in the formula used implied the vanishing of fifth
differences (see K. Pearson, Tables for Statisticians and Biometricians, Part II, p. xvii).





Each case was treated in turn, and the six values of the percentage points q_α, corresponding
to the six different values of α under consideration, were determined using the relations
given in (10) above. Taking a trial value of q_α, the integrals

\[ I(w, q_\alpha) = \frac{2}{\sqrt{2\pi}} \int_0^{w q_\alpha} e^{-\frac{1}{2}x^2}\,dx \tag{21} \]

were calculated at intervals of 0.25 over the whole range of w. Quadrature was then applied
to the products y(w) I(w, q_α) over the range 0 ≤ w < ∞, to obtain the value of (1 − α)
corresponding to the trial value of q_α. This procedure was repeated a number of times to obtain
values of (1 − α) corresponding to a series of equidistant values of q_α. The required values of
q_α corresponding to the six values of α under investigation were then obtained by backward
interpolation.

As n → ∞ the ratio w/d_n tends to the population value of the standard deviation. Furthermore,
for n = 2 and n = 3, exact values of q_α had been previously obtained by direct calculation
for the six values of α. Thus it was possible to make initial estimates of the required
values of q_α, and the process of this ‘trial and error’ method was not found too laborious.
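The quadrature procedure just described can be imitated on a computer. The Python sketch below is not the author's hand computation: it swaps in a numerically integrated range density in place of the Pearson & Hartley tables, uses Simpson's rule on a finer grid, and replaces trial values with backward interpolation by simple bisection. All function names, grid spacings and limits are assumptions chosen for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)


def phi(x):
    """Ordinate of the unit normal curve."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def big_phi(x):
    """Probability integral of the unit normal curve."""
    return 0.5 * (1.0 + math.erf(x / SQRT2))


def simpson(values, step):
    """Composite Simpson rule for an odd number of equally spaced ordinates."""
    inner = 4.0 * sum(values[1:-1:2]) + 2.0 * sum(values[2:-1:2])
    return (values[0] + values[-1] + inner) * step / 3.0


def range_density(w, n, step=0.1, limit=8.0):
    """Ordinate p(w) of the range in samples of n from a unit normal population."""
    count = int(round(2.0 * limit / step))
    xs = [-limit + i * step for i in range(count + 1)]
    ordinates = [phi(x) * phi(x + w) * (big_phi(x + w) - big_phi(x)) ** (n - 2)
                 for x in xs]
    return n * (n - 1) * simpson(ordinates, step)


def percentage_point_u(alpha, n, step=0.05, w_max=10.0):
    """Solve 1 - alpha = integral of p(w) I(w, q) dw for q, then return q * d_n."""
    count = int(round(w_max / step))
    ws = [i * step for i in range(count + 1)]
    density = [range_density(w, n) for w in ws]
    d_n = simpson([w * p for w, p in zip(ws, density)], step)

    def central_area(q):
        # I(w, q) = 2 * Phi(w q) - 1 = erf(w q / sqrt(2))
        return simpson([p * math.erf(w * q / SQRT2) for w, p in zip(ws, density)], step)

    low, high = 0.0, 1000.0
    for _ in range(60):  # bisection on the monotone function central_area
        mid = 0.5 * (low + high)
        if central_area(mid) < 1.0 - alpha:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high) * d_n


if __name__ == "__main__":
    print(percentage_point_u(0.10, 3), percentage_point_u(0.05, 4))
```

For n = 3 the result can be compared with the exact value from (20), and for n = 4 with the framework value in Table 1; the sketch is intended for small and moderate q_α, where the integrand is smooth on the chosen grid.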


Table 1. Framework values of percentage points of u = u(1, n)

                                   α
    n        0.10      0.05      0.02     0.01     0.002    0.001
    2       5.0376   10.1381   25.389   50.791   253.97   507.95-
    3       2.5935-   3.8225+   6.188    8.819    19.84    28.08
    4       2.1793    2.9505+   4.213    5.420     9.42    11.75+
    6       1.9364    2.4765+   3.249    3.900     5.71     6.66
   10       1.8064    2.2390     ...     3.244     4.32     4.82
   16       1.7496    2.1385+   2.628    2.990     3.82     4.19
   20 (a)    ...      2.1083    2.576    2.916     3.69     ...
   20 (b)   1.7314    2.1074    2.576    2.916     3.69     ...
    ∞       1.6449    1.9600    2.326    2.576     3.09     3.29


The framework values of the percentage points of u(1, n) were obtained by multiplying
the values of q_α by the corresponding values of the mean range d_n tabulated by Tippett
(1925) and are given in Table 1, together with the exact values for n = 2 and n = 3
determined above.

As a check on the accuracy of determination of these percentage points, the six values
for n = 20 were also calculated by a second method. Writing

\[ r = 1/q = w/x, \tag{22} \]

the method is to determine, for the six values of α, the corresponding values of q_α such that

\[ \alpha = \int_{-1/q_\alpha}^{1/q_\alpha} p(r)\,dr, \tag{23} \]

where y = p(r) denotes the frequency distribution of r. Since q is distributed symmetrically
about zero, its reciprocal r is also distributed symmetrically about zero, and from (22)
and (23) it follows that

\[ \alpha = 2\int_0^{1/q_\alpha} p(r)\,dr = 2\int_0^\infty p(x) \left[ \int_0^{x/q_\alpha} p(w)\,dw \right] dx. \tag{24} \]







Ordinates of the normal curve y(x) = (1/√(2π)) e^{−x²/2} were taken at intervals of 0.25 for x in
the range 0 ≤ x < ∞ from K. Pearson’s Tables for Statisticians and Biometricians, Part I.
Taking a trial value of 1/q_α, the integrals ∫_0^{x/q_α} p(w) dw were calculated from Hartley’s
four-figure tables for each value of x in the above range. By quadrature applied to the products
p(x) ∫_0^{x/q_α} p(w) dw the integral (24) was evaluated. By taking a series of equidistant values of
1/q_α, other trial values of α were determined. Backward interpolation was then used to
obtain the required values of 1/q_α corresponding to the six values of α under consideration.
Finally, the six percentage points of u were determined by multiplying the values of q_α by
d_20 given in Tippett’s table. These percentage points are given in the penultimate line (b)
of Table 1, and comparison with the corresponding values in the line above, (a), indicates
good agreement between the two methods of computation.

Since the percentage points of u for n = 4, 6, 10 and 16 have been determined using
Hartley’s five-figure manuscript tables of the cumulative frequency distribution of w, they
should be at least as accurate as the percentage points for n = 20 determined from the
four-figure tables.

Changes in the percentage points at the different levels of significance run most smoothly
if arguments proportional to 1/n are used in place of n, and reciprocals of u for the variate.
Using a six-point general Lagrangian formula applied to the points corresponding to
n = 3, 4, 6, 10, 16 and 20, values of percentage points of u were determined for n = 6, 7, 8, 14
and 18. (In the case of n = 20 the mean values of the percentage points determined by the
two methods were used.) The interval was then halved, using a nine-point Lagrangian
through points corresponding to n = 4, 6, 8, 10, 12, 14, 16, 18 and 20. Finally, the six sets
of percentage points were differenced as a check, reduced by either one or two figures and, with
the exception of those for n = 5, are given in the second columns of Tables 3-8 under m = 1.

For n = 5, the six percentage points of u were independently determined at a later stage
of the investigation by the method used above for n = 4, 6, 10, 16 and 20, and it is these
values which are given in Tables 3-8. In the table below, the values obtained by interpolation
from the framework values are compared with those determined by direct
calculation.


Percentage points of u(1, 5)

  α =                      0.10     0.05     0.02     0.01     0.002    0.001
  By direct calculation     ...     2.636     ...     4.38     6.8      8.2
  By interpolation          ...     2.636+    ...     4.38     6.8      8.2

In one case only, for α = 0.10, is there a difference as much as one unit in the last figure,
the actual values obtained being 2.0192 by direct computation and 2.0198 by interpolation.

Taking all the various checks into consideration, it appears unlikely that the values of
the percentage points of u(1, n) given in the tables at the end of the paper are in error by
more than one unit in the last place. The values for n = 2 and n = 3 are, of course, exact.














(v) Computation of percentage points of the distribution of
u = u(2, n), i.e. case with m = 2

The exact distribution of u(2, n) cannot be determined analytically except in the case of
n = 2, and hence the various percentage points have necessarily to be evaluated almost
wholly by numerical methods.

The probability of the joint occurrence of a pair of ranges w′ and w″ from random samples
of equal size n from a normal population of unit standard deviation is given by

\[ p(w',w'')\,dw'\,dw'' = p(w')\,p(w'')\,dw'\,dw''. \tag{25} \]

If w̄ be the mean value of the two ranges, then its distribution is obtained by integrating (25):

\[ p(\bar{w})\,d\bar{w} = \iint p(w')\,p(w'')\,dw'\,dw'', \tag{26} \]

the integration being taken over the whole field of w′ and w″ subject to the conditions

\[ \bar{w} = \tfrac{1}{2}(w'+w'') \qquad (0 \le w' < \infty,\; 0 \le w'' < \infty). \tag{27} \]

We shall change the variables from w′ and w″ to w̄ and w′, the Jacobian of the transformation
being equal to 2. With these new variables, and noting from (27) that w′ varies from 0 to
2w̄, equation (26) gives

\[ p(\bar{w}) = 2\int_0^{2\bar{w}} p(w')\,p(2\bar{w}-w')\,dw'. \tag{28} \]
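The convolution integral for the mean of two ranges lends itself to a short numerical illustration. In the Python sketch below the function names and the number of Simpson intervals are assumptions. The check uses the n = 2 range density e^{−w²/4}/√π, for which carrying out the integral by completing the square gives 2√(2/π) e^{−w̄²/2} erf(w̄/√2); this closed form is stated here as a derived check rather than quoted from the paper.

```python
import math


def range_density_n2(w):
    """Density of the range in samples of two from a unit normal population."""
    return math.exp(-0.25 * w * w) / math.sqrt(math.pi)


def mean_of_two_density(p, w_bar, intervals=200):
    """Ordinate of the mean of two independent ranges with common density p:
    2 * integral from 0 to 2*w_bar of p(w') p(2*w_bar - w') dw' (Simpson's rule)."""
    if w_bar <= 0.0:
        return 0.0
    step = 2.0 * w_bar / intervals
    ords = [p(i * step) * p(2.0 * w_bar - i * step) for i in range(intervals + 1)]
    inner = 4.0 * sum(ords[1:-1:2]) + 2.0 * sum(ords[2:-1:2])
    return 2.0 * (ords[0] + ords[-1] + inner) * step / 3.0


def closed_form_n2(w_bar):
    """Closed form of the same ordinate when n = 2 (derived check)."""
    return (2.0 * math.sqrt(2.0 / math.pi) * math.exp(-0.5 * w_bar * w_bar)
            * math.erf(w_bar / math.sqrt(2.0)))


if __name__ == "__main__":
    for w_bar in (0.5, 1.0, 2.0):
        print(w_bar, mean_of_two_density(range_density_n2, w_bar), closed_form_n2(w_bar))
```

Any callable density can be passed as `p`, so the same routine could be applied to tabulated or numerically integrated range ordinates for larger n.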


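Special case n = 2, m = 2

In equation (11) above is given the distribution of the range in samples of two random
values from a normal population with unit standard deviation. Substituting this in (28)
above, the distribution of means of two independent values of w is therefore given by

\[ p(\bar{w}) = 2\int_0^{2\bar{w}} \frac{1}{\sqrt{\pi}}\,e^{-\frac{1}{4}w'^2}\,\frac{1}{\sqrt{\pi}}\,e^{-\frac{1}{4}(2\bar{w}-w')^2}\,dw'. \tag{29} \]

Using the notation

\[ I(\bar{w}) = \frac{1}{\sqrt{2\pi}} \int_0^{\bar{w}} e^{-\frac{1}{2}t^2}\,dt, \]

the expression (29) for the distribution of the mean range may be written in the form

\[ p(\bar{w}) = \frac{4\sqrt{2}}{\sqrt{\pi}}\,e^{-\frac{1}{2}\bar{w}^2}\,I(\bar{w}). \tag{30} \]

We may now proceed to determine the distribution of the ratio q = x/w̄. Since they are
independent, the joint distribution of x and w̄ is, from (6) and (30), given by

\[ p(x,\bar{w})\,dx\,d\bar{w} = \frac{4}{\pi}\,e^{-\frac{1}{2}(x^2+\bar{w}^2)}\,I(\bar{w})\,dx\,d\bar{w}. \tag{31} \]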
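Transforming to new variables q and w̄, noting that the Jacobian of the transformation is equal
to w̄, and integrating, we obtain

\[ p(q) = \frac{4}{\pi} \int_0^\infty e^{-\frac{1}{2}\bar{w}^2(q^2+1)}\,I(\bar{w})\,\bar{w}\,d\bar{w}. \tag{32} \]

Now since

\[ \frac{dI(\bar{w})}{d\bar{w}} = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}\bar{w}^2}, \]

we make use of the identity

\[ \frac{d}{d\bar{w}}\left\{ e^{-\frac{1}{2}\bar{w}^2(q^2+1)}\,I(\bar{w}) \right\}
 = -\bar{w}(q^2+1)\,e^{-\frac{1}{2}\bar{w}^2(q^2+1)}\,I(\bar{w}) + e^{-\frac{1}{2}\bar{w}^2(q^2+1)}\,\frac{dI(\bar{w})}{d\bar{w}} \]

to evaluate the integral in (32) and obtain

\[ p(q) = \frac{4}{\pi\sqrt{2\pi}\,(q^2+1)} \int_0^\infty e^{-\frac{1}{2}\bar{w}^2(q^2+2)}\,d\bar{w}
 = \frac{2}{\pi\,(q^2+1)\sqrt{q^2+2}}. \tag{33} \]

The ratio q is distributed symmetrically about zero and from (9) and (33) we obtain

\[ 1-\alpha = \frac{4}{\pi} \int_0^{q_\alpha} \frac{dq}{(q^2+1)\sqrt{q^2+2}}, \]

and the percentage points are therefore given by

\[ q_\alpha = q_\alpha(2,2) = \frac{\sqrt{2}\,\tau}{\sqrt{1-\tau^2}}, \tag{34} \]

where τ = tan{(π/4)(1 − α)}.

Substitution of α = 0.10, 0.05, etc., in (34) gives the required values of q_α, and further
multiplication by d_2 = 2/√π gives the required corresponding percentage points of the
distribution of u(2, 2).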
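The closed form for n = 2, m = 2 can be evaluated in the same way as the single-range cases. The Python sketch below is illustrative only, with hypothetical function names: it computes τ = tan{(π/4)(1 − α)}, then q_α = √2 τ/√(1 − τ²), and multiplies by d_2 = 2/√π.

```python
import math


def q_alpha_2_2(alpha):
    """Percentage point of q when the estimate is the mean of two ranges,
    each from a sample of two values."""
    tau = math.tan(math.pi * (1.0 - alpha) / 4.0)
    return math.sqrt(2.0) * tau / math.sqrt(1.0 - tau * tau)


def u_alpha_2_2(alpha):
    """Percentage point of u(2, 2), i.e. q multiplied by d_2 = 2 / sqrt(pi)."""
    return q_alpha_2_2(alpha) * 2.0 / math.sqrt(math.pi)


if __name__ == "__main__":
    for alpha in (0.10, 0.05, 0.02, 0.01):
        print(alpha, round(u_alpha_2_2(alpha), 4))
```

These values may be compared with the first row of Table 2.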


General case n ≥ 3, m = 2

For n = 4, 6, 10, 16 and 20, the ordinates y(w) of the distribution of the range have been
previously evaluated above at intervals of 0.25 for w, and these are used in place of the
unknown p(w). Taking a particular value of w̄, quadrature was then applied to the products
y(w′) y(2w̄ − w′) to obtain a numerical estimate of p(w̄) from equation (28). This process was
repeated at intervals of 0.25 for w̄ through as much of the range 0 ≤ w̄ < ∞ as was necessary
to obtain the required degree of accuracy. For n = 3, determination of the ordinates of the
distribution of w̄ was also based on quadrature, but, in this case, exact figures for the
ordinates of the distribution of the range were obtained from its equation found by McKay
& Pearson (1933). Because of the rapid rise of the distribution near to the origin, estimates
of the lower values of p(w̄) in this case were determined using an interval of … for w̄ in order
to obtain the requisite accuracy. For higher values of w̄ the interval was progressively
decreased to 0.25 used over the tail portion of the curve.





Treating in turn each curve of the distribution of w̄ (for n = 3, 4, 6, 10, 16 and 20), the
values of the percentage points of the distribution of the ratio

\[ q = \frac{u}{d_n} = \frac{x}{\bar{w}} \]

were computed by a method similar to that used in evaluating the percentage points of q
for the case m = 1. Taking trial values of q_α, the integrals

\[ I(\bar{w}, q_\alpha) = \frac{2}{\sqrt{2\pi}} \int_0^{\bar{w} q_\alpha} e^{-\frac{1}{2}x^2}\,dx \]

were calculated at intervals of 0.25 for w̄. Quadrature was then applied to the products
y(w̄) I(w̄, q_α) over the range 0 ≤ w̄ < ∞ to obtain corresponding values of (1 − α), α being the sum of
the two tails of the distribution beyond deviations ±q_α. Repeating this procedure, a series
of values of (1 − α) were obtained, corresponding to a set of equidistant values of q_α. Backward
interpolation was then used to obtain the six values of q_α corresponding to the six
values of α under consideration. Finally, the required percentage points of u were obtained


Table 2. Framework values of percentage points of u = u(2, n)

                                   α
    n        0.10      0.05      0.02     0.01     0.002    0.001
    2       2.6203    3.8671    6.266    8.932    20.10    28.47
    3       2.0201    2.6365+   3.565+   4.340     6.78     8.66
    4       1.8760    2.3672     ...     3.600     ...      5.80
    6       1.7791    2.1920    2.727    3.133     4.11     ...
   10       1.7225+   2.0926    2.551    2.884     3.64     ...
   16       1.6961    2.0470    2.472    2.775+    3.44     3.71
   20       1.6879    2.0329    2.449    2.742     3.38     3.64


by multiplication by the appropriate value of d_n, and are given in Table 2, together with
the exact values for n = 2 determined from (34).

As before, Lagrangian formulae were used to interpolate the intermediate values of the
percentage points of u, again taking arguments proportional to 1/n and reciprocals of the
percentage points for the variate. As a check, the values were inspected by determining
differences up to the third order and then reduced by one place of decimals. With the
exception of the values for n = 5, the reduced values are given in Tables 3-8.

For n = 5 the six percentage points of u were independently determined at a later stage
of the investigation by the same methods used for the framework values. These directly
computed values are given in the tables mentioned above, and are also reproduced below
for comparison with those obtained by interpolation from the framework values.


Percentage points of u = u(2, 5)

  α =                      0.10     0.05     0.02     0.01     0.002    0.001
  By direct calculation    1.814     ...     2.84     3.29     4.4      5.0
  By interpolation         1.814     ...     2.84     3.29     4.4      4.9














In the case of α = 0.001, direct calculation gives a value of 4.97 compared with 4.92
obtained by interpolation. For other percentage levels the agreement is exact to the number
of figures quoted.


(vi) Computation of percentage points of the distribution of
u = u(m, n) for m > 2

The variance of the mean range in m subgroups of equal size n steadily decreases as m
increases, and the ratio w̄(m, n)/d_n gives closer estimates of the population value of the
standard deviation of the variate. Hence, following the usual methods of large-sample theory,
the limiting values of the percentage points of u, for indefinitely large m, may be determined
from integral tables of the normal curve. For a given value of α, the limiting values of the
percentage points are, of course, equal for all values of n, and are also equal to the corresponding
limiting values of the percentage points of Fisher’s t-distribution for an indefinitely large
number of degrees of freedom.

In general it was found for a particular value of α and of n that a three-point Lagrangian
curve with 1/m as argument and reciprocals of the percentage points of u as variate (passing
through points corresponding to m = 1, 2 and ∞) may be used for interpolation of the
required percentage points corresponding to values of m intermediate between 2 and ∞.
Only in the case of n = 2 and n = 3 was the required accuracy not attained by this procedure
and further investigation found necessary. Details of the methods used are given below.
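The three-point interpolation in reciprocals can be sketched as follows. The Python code is an illustration under stated assumptions, not a reproduction of the author's working: the function names are hypothetical, and the example inputs are the framework values quoted above for n = 20 and α = 0.05 (2.1074 for m = 1, 2.0329 for m = 2) together with the normal limit 1.96. The interpolated figure it prints is not a value tabulated in the paper.

```python
def lagrange(xs, ys, x):
    """Value at x of the Lagrangian polynomial through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def interpolate_percentage_point(m, u_m1, u_m2, u_limit):
    """Three-point interpolation with 1/m as argument and 1/u as variate,
    through the points for m = 1, m = 2 and m = infinity (argument 0)."""
    arguments = [1.0, 0.5, 0.0]
    variates = [1.0 / u_m1, 1.0 / u_m2, 1.0 / u_limit]
    return 1.0 / lagrange(arguments, variates, 1.0 / m)


if __name__ == "__main__":
    for m in (1, 2, 3, 4, 5, 10):
        print(m, round(interpolate_percentage_point(m, 2.1074, 2.0329, 1.96), 4))
```

Working with reciprocals of both m and u keeps the three points nearly collinear, which is why so few points give a smooth interpolating curve when n is large.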

In the case of n = 2, the percentage points of the distribution of u(m, 2) were also determined
for m = 4 and m = 8 as follows. First, considering m = 4, it was necessary to obtain
numerical estimates of the ordinates of the distribution of the means of four ranges, each
range from a random sample of two values from a normal population with unit standard
deviation. Following the method used for m = 2 and leading to equation (28), it is easy to
show that the distribution of w̄(2m, n), the mean of 2m independent ranges each from a
random subsample of n values, is given in terms of the distribution of the mean, w̄(m, n), of
m such ranges by

\[ p\{\bar{w}(2m,n)\} = 2\int_0^{2\bar{w}(2m,n)} p\{\bar{w}(m,n)\}\; p\{2\bar{w}(2m,n) - \bar{w}(m,n)\}\, d\bar{w}(m,n). \tag{36} \]

Using numerical values for p(w̄) (m = 2, n = 2) given by equation (29) above, estimates
of the ordinates of the distribution of p(w̄) for m = 4, n = 2 at intervals of 0.25 were found by
quadrature methods similar to those described in previous sections by using the above
expression. A repetition of this process, using these last computed values, yielded numerical
estimates of the distribution of the means of eight ranges, i.e. m = 8. Again applying quadrature
to the two distributions, values of (1 − α) were determined for a series of equidistant
values of q_α (m = 4 and m = 8). The required values of q_α, corresponding to the six values
of α between 0.10 and 0.001 under consideration, were then obtained by backward interpolation,
and hence the percentage points of u(4, 2) and u(8, 2).

The sets of six percentage points of u(m, 2) were determined for each required value of m
by Lagrangian interpolation, reciprocals of m being used as argument and reciprocals of
the corresponding percentage points as variate, the curve passing through the points
corresponding to m = 1, 2, 4, 8 and ∞. The interpolated values of percentage points obtained by
this method, and the directly computed values for m = 4 and m = 8, are given in Tables 3-8
for a series of values of m suitable for practical use. As a check upon the method, the
computation was repeated, this time using a four-point Lagrangian passing through points
corresponding to m = 1, 2, 4 and ∞. Most of these interpolated four-point values of the
percentage points agree exactly with the five-point values previously obtained. In cases
where differences arise, none exceed 1 unit in the last figure. The five-point Lagrangian
method of interpolation therefore certainly appears to be quite adequate for furnishing the
required degree of accuracy.

Numerical estimates of the distribution of the means of pairs of ranges from subsamples
of size n = 3 and n = 4 have already been obtained above. Using these in turn in equation
(36), estimates of the ordinates of the distributions of the means of four ranges were determined
at intervals of 0.25 by quadrature methods. The sets of percentage points of u(4, 3)
and u(4, 4) were then computed by the previous method of trial values and subsequent
backward interpolation.

For n = 3 and n = 4, the percentage points of the distribution of u, for given values of m,
are lower and nearer to their limiting values than the corresponding points for n = 2.
Furthermore, the changes in the values of the percentage points for small values of m are
also less abrupt. In view of the agreement between the four-point and five-point Lagrangian
interpolated values of the percentage points for n = 2, a four-point Lagrangian through
points corresponding to m = 1, 2, 4 and ∞ may certainly be relied upon to give adequate
accuracy for the interpolation of percentage points corresponding to intermediate values
of m in the case of n = 3 and n = 4. These values, together with the computed values for
m = 4, are given in Tables 3-8.

For the remaining values of n, from 5 to 20, the interpolated percentage points given in
Tables 3-8 have been obtained by means of a three-point Lagrangian curve, using values
of the percentage points corresponding to values of m = 1, 2 and ∞. As in the previous cases
of interpolation, reciprocals of m and the variate were used in order to obtain small changes
in successive differences. To show that this method is adequate, the six sets of percentage
points for n = 4 were also interpolated using a three-point Lagrangian. In every case except
one, these values agreed exactly with the four-point Lagrangian interpolated values previously
found and given in Tables 3-8. In the case of the sole exception, the difference
between the two interpolated values was only one unit in the last figure. For the less rapidly
changing values of the percentage points of u for n ≥ 5, the three-point method of interpolation
therefore provides sufficient accuracy for the present purpose.

Taking all checks into consideration it appears that the tabulated values of the percentage
points of the distribution of the function u = u(m, n) may be relied upon to the
accuracy given; occasionally the values may be one unit in error in the last figure.

In Tables 3 and 4, the 10 % and 5 % points of u were computed to 3 decimal places,
but lack of space has necessitated these being curtailed for publication. For the same
reason the values of the percentage points of u for the odd values of n = 11, 13, ..., 19 have
been omitted. In practical applications of the test, it is not considered that this reduction
will cause any undue inconvenience. Fuller tables have, however, been retained for
forthcoming work on the power of the u-test and are available for consultation if required.

(vii) Approximate values of the percentage points of u

If there are m subgroups each of n values, and if the estimate of standard deviation is
determined as the root-mean-square of the deviations of variate values from the respective
means of the subgroups, then the number of degrees of freedom is ν = m(n − 1). Unlike the
usual t-test, when the estimate of standard deviation is determined from the mean range in
m subgroups of equal size n, the percentage points of the modified t-distribution investigated
above depend upon the relation between m and n. Reference to Tables 3-8 indicates that,
for a constant number of degrees of freedom ν = m(n − 1), the values of the percentage points
on a given probability level vary slightly as m and n vary. For example, taking α = 0.05
and ν = 8 we have the following percentage points: 2.272 for m = 1 and n = 9, 2.254 for
m = 2 and n = 5, 2.260 for m = 4 and n = 3, and 2.264 for m = 8 and n = 2. In general,
however, the range in the values of the percentage points of u for a given value of ν is small,
and this permits the construction of a table giving approximate values of the six sets of
percentage points corresponding to different numbers of degrees of freedom.


Approximate values of percentage points of u

  Degrees of                        Values of α
  freedom
  ν = m(n − 1)    0.10    0.05    0.02    0.01    0.002    0.001
       1          5.0    10.1    25.4    50.8    254.0    507.9
       2          2.6     3.8     6.2     8.9     19.9     28.3
       3          2.2     3.0     4.2     5.6      9.4     11.8
       4          2.0     2.6     3.6     4.4      6.8      8.3
       5          1.9     2.5     3.3     3.9      5.7      6.7
       6          1.9     2.4     3.1     3.6      5.1      5.9
       7          1.8     2.3     3.0     3.5      4.7      5.4
       8          1.8     2.3     2.9     3.3      4.5      5.0
       9          1.8     2.2     2.8     3.2      4.3      4.8
      10          1.8     2.2     2.7     3.1      4.2      4.6
      11          1.8     2.2     2.7     3.1      4.1      4.5
      12          1.8     2.2     2.7     3.1      4.0      4.3
      13          1.8     2.2     2.7     3.1      3.9      4.3
      14          1.7     2.1     2.6     3.0      3.8      4.2
      15          1.7     2.1     2.6     2.9      3.7      4.1
      16          1.7     2.1     2.6     2.9      3.7      4.1
      17          1.7     2.1     2.6     2.9      3.7      4.1
      18          1.7     2.1     2.6     2.9      3.6      4.0
      19          1.7     2.1     2.6     2.9      3.6      4.0
      20          1.7     2.1     2.5     2.8      3.6      3.9
      30          1.7     2.0     2.4     2.7      3.4      ...
      60          1.7     2.0     2.4     2.7      3.2      3.5
     120          1.7     2.0     2.4     2.6      3.2      3.4
       ∞          1.64    1.96    2.33    2.58     3.09     3.29


For a particular pair of values of m and n, the values of the percentage points for
ν = m(n − 1) degrees of freedom given in the table above are generally not in error by more
than one unit in the last place of figures. This degree of accuracy is frequently sufficient for
many practical applications of the distribution of u. To settle the significance of cases giving
values of u close to the above approximate values, reference should be made to the accurate
values given in Tables 3-8.


(viii) Apflioations o® the u-test 

The difference between the mean of a sample of n random values of a normally distributed
variate and the population value is shown in the Appendix to be independent of the total
range in the sample, and also independent of the mean range determined from random
subgroups of values. The modified t-test based on range estimates of standard deviation may
therefore be used in various statistical tests of significance involving deviations of sample
means. The application of this range test to sampling problems is analogous to that of the
well-known t-test, and no detailed description is therefore required. The most frequent use
of the new test will be found in the treatment of experimental data of various types, and
also in the examination of test results recorded for the purpose of control of the quality of
industrial products. In this latter type of work, cases frequently arise when it is desirable to
apply a rapid test for determining the significance of a difference between the mean of a
sample and some preassigned value, frequently some desired control level, or the significance
of the difference between two sample means. Furthermore, for routine purposes, it is often
desirable that the test should not only be rapid but also of a simple nature, thus enabling it
to be used by workers with little mathematical or even arithmetical aptitude. The new range
test has the advantages of greater simplicity and a greatly reduced amount of computing
compared with the standard t-test. The use of range estimates of standard deviation, in
place of root-mean-square estimates, necessarily entails some loss of precision, but in a future
paper it will be shown that this reduction in accuracy is small and certainly negligible for
most practical purposes.

The most frequent applications of the range test are considered below and are followed
by several numerical examples in which, for purposes of comparison, the parallel treatment
by the t-test is also given. As in the t-test, the application of the range test involves the
assumption of normality of variate distribution and randomness of sampling. Furthermore,
where the standard deviation is estimated from the mean range of several subgroups of
values, care should be taken to ensure that the arrangement of these values is also random.
This latter condition is usually fulfilled by considering the values in the order in which they
were originally recorded. In a few cases, however, the order of recording may not be random;
the particular circumstances of a test may be such that the order of the observations may be
wholly or partly dependent upon their magnitude. In such cases a set of values can be
divided into random subgroups by the use of tables of random sampling numbers or by
other means.


(a) Difference between sample mean and population mean 

Suppose we have some preassigned value $\xi$ and wish to test whether the mean $\bar{x}$ of a
sample of N values may be considered as a reasonable estimate of $\xi$, or whether the difference
between $\bar{x}$ and $\xi$ is real in the statistical sense. The usual assumption, the so-called ‘Student’s
Hypothesis’, is made that $\bar{x}$ is the mean of a random sample from a normal population of
which the mean is $\xi$ and the standard deviation is $\sigma$. The differences $(\bar{x} - \xi)$ will be distributed
about a mean of zero with a standard error equal to $\sigma/\sqrt{N}$. If the sample be divided into m
random subgroups of equal size n, so that N = mn, and $\bar{w}$ is the mean of the m ranges of the
subgroups, then the sample estimate of the standard error of the mean is $\bar{w}/(d_n\sqrt{N})$. The ratio
of the difference between the means to the estimate of its standard error is

$$u = \frac{|\bar{x} - \xi|\, d_n \sqrt{N}}{\bar{w}}. \qquad (36)$$

If the computed value of u exceeds the corresponding percentage point in one of Tables
3-8, then the difference is considered unlikely to have arisen through random sampling on
that particular probability level α. As in the case of the t-test, when considering the
asymmetrical case of ‘Student’s Hypothesis’, the values of α at the headings of the tables should
be halved.

For fairly small values of N, the estimate of the standard error of the mean may be
determined, not from the mean range in subgroups, but from the total range between the maximum
and minimum values in the sample. In the notation used above, this corresponds to m = 1
and n = N. The test of the significance of the difference may be made as above and the
computed value of u compared with the percentage points in Tables 3-8. In these cases,
however, the computation may be curtailed by using the ratio

$$\frac{\delta}{w} = \frac{u}{d_n \sqrt{n}}, \qquad (37)$$

where $\delta = |\bar{x} - \xi|$ and w is the range in the undivided sample. Table 9 gives values of the
ratio $\delta/w$ for various levels of significance corresponding to the sum of the two tails of the
distribution. For a chosen level of significance the difference $\delta$ is considered too large to
have arisen through random sampling errors if the value of $\delta/w$ exceeds the corresponding
tabulated value. Table 9 will also be found useful for giving a rapid estimate of the accuracy
of the mean based on a small number of observations.
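
The two one-sample forms above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the small dictionary `D_N` are hypothetical, the $d_n$ values are copied from Table 11, and subgroups are formed from consecutive values in the recorded order.

```python
import math

# Mean range d_n in samples of n from a unit normal population (Table 11, selected n).
D_N = {2: 1.1284, 3: 1.6926, 4: 2.0588, 5: 2.3259, 6: 2.5344, 10: 3.0775}


def u_one_sample(values, xi, n):
    """Equation (36): u from the mean range of m = N/n consecutive subgroups."""
    N = len(values)
    if N % n:
        raise ValueError("N must be a multiple of the subgroup size n")
    groups = [values[i:i + n] for i in range(0, N, n)]
    w_bar = sum(max(g) - min(g) for g in groups) / len(groups)
    x_bar = sum(values) / N
    return abs(x_bar - xi) * D_N[n] * math.sqrt(N) / w_bar


def delta_over_w(values, xi):
    """Equation (37): |mean - xi| divided by the range of the undivided sample."""
    x_bar = sum(values) / len(values)
    return abs(x_bar - xi) / (max(values) - min(values))
```

With a single subgroup (m = 1, n = N) the two quantities differ only by the factor $d_n\sqrt{n}$, so either may be referred to its own table.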


(b) Difference between two sample means

Suppose the first sample of size $N_1$ be divided into $m_1$ random subgroups of size n, and the
second sample of size $N_2$ be divided into $m_2$ random subgroups also of size n, i.e.

$$n = N_1/m_1 = N_2/m_2.$$

The hypothesis is made that each sample can be considered as a random selection from the
same normal population. Let the numerical value of the difference between the two sample
means be $|\bar{x}_1 - \bar{x}_2|$, and the mean of the $(m_1 + m_2)$ ranges of n values be $\bar{w} = \bar{w}(m_1 + m_2, n)$,
giving an estimate $\bar{w}/d_n$ for the standard deviation of the variate. The ratio of the difference
between the two sample means to the range estimate of the standard error of the difference is

$$u = \frac{|\bar{x}_1 - \bar{x}_2|\, d_n}{\bar{w}\,\sqrt{1/N_1 + 1/N_2}}. \qquad (38)$$


The significance of the difference between the means in any particular case can be
determined by noting whether the computed value of u exceeds the corresponding percentage
point for a chosen value of α by reference to Tables 3-8, using the column headed $m = m_1 + m_2$.

When the samples are small and of equal size, say n, the variate standard deviation can
be estimated from the two total ranges in the samples. If $w'$ and $w''$ are the two ranges, with
a mean value $\bar{w} = \tfrac{1}{2}(w' + w'')$, then

$$u = \frac{|\bar{x}_1 - \bar{x}_2|\, d_n \sqrt{n/2}}{\bar{w}} \qquad (39)$$

may be used as above for testing the significance of the difference between the two means.
A more rapid test may, however, be made by simply determining the value of the ratio of
the difference between sample means to the average of the two sample ranges,

$$\frac{|\bar{x}_1 - \bar{x}_2|}{\tfrac{1}{2}(w' + w'')}. \qquad (40)$$

In Table 10 are given values of the above ratio lying on six different probability levels. For
a given level of significance α, values of the ratio smaller than those tabulated may be
considered to have arisen through random sampling errors; greater values indicate that a given
difference is unlikely to have arisen through chance and therefore point to a real difference.

In the computation of u it is necessary to use values $d_n$ of the mean range in samples from
a normal population of unit standard deviation. A selection of the values determined by
Tippett (1925) is reproduced in Table 11 to avoid the necessity of frequent reference to his
original paper, and is accompanied by the corresponding values of $1/d_n$, $\sqrt{n}$ and $d_n\sqrt{n}$.

(c) Confidence intervals 

As with ‘Student’s’ test, the tables of percentage points may be used to estimate with 
a given measure of confidence, the interval within which it can be stated that g or 

Examples 

Example 1. The following data have been previously used as an example by ‘Student’
(1908). Ten patients were treated with the optical isomers of hyoscyamine hydrobromide
and the additional hours of sleep were noted.


Additional hours sleep gained by use of hyoscyamine hydrobromide

Patient    Dextro- (D)    Laevo- (L)    Difference (L − D)
   1          +0.7           +1.9             +1.2
   2          −1.6           +0.8             +2.4
   3          −0.2           +1.1             +1.3
   4          −1.2           +0.1             +1.3
   5          −0.1           −0.1              0.0
   6          +3.4           +4.4             +1.0
   7          +3.7           +5.5             +1.8
   8          +0.8           +1.6             +0.8
   9           0.0           +4.6             +4.6
  10          +2.0           +3.4             +1.4
Means         +0.75          +2.33            +1.58


The last column may be used for the controlled comparison of the two drugs, since their 
effects were measured on the same ten patients. The laevo form has given a greater figure for 
the additional hours sleep than the dextro form. Whether the former may be considered as 
the better soporific is examined by both the standard deviation and range tests. 

(a) The sum of squares of deviations of the differences about their mean value is 13.616,
associated with 9 degrees of freedom. The estimate of the standard error is therefore 0.3890,
and the value of t works out to be 1.58/0.3890 = 4.06. For 9 degrees of freedom a value of
t = 3.250 lies on the 1 % level of significance. Assuming normal random sampling, a value
of t equal to or greater than 4.06 will occur much less frequently than once in a hundred times.
This leads to the conclusion that the laevo form is better for producing sleep than the dextro
form.

(b) For examination by the range, the value of u = u(1, 10) may be computed, but in this
case it is simpler to use the shortened method of equation (37). The ratio of the mean
difference to the range in the ten individual differences is $\delta/w$ = 1.58/4.6 = 0.34. Reference to
Table 9 shows that this value is slightly in excess of the tabulated value 0.333 on the 1 %
level of significance, leading to the same conclusion as that drawn from the t-test.
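
The arithmetic of Example 1 can be retraced with a short Python sketch. The variable names are illustrative, and the ten differences are taken as laevo minus dextro from the ‘Student’ (1908) data; the critical values 3.250 and 0.333 are those quoted in the text, not computed here.

```python
import math

# Differences in additional hours of sleep (laevo minus dextro), patients 1 to 10.
diffs = [1.2, 2.4, 1.3, 1.3, 0.0, 1.0, 1.8, 0.8, 4.6, 1.4]

n = len(diffs)
mean = sum(diffs) / n
ss = sum((d - mean) ** 2 for d in diffs)      # sum of squares about the mean
se = math.sqrt(ss / (n - 1) / n)              # root-mean-square standard error of the mean
t = mean / se                                 # 'Student' ratio, 9 degrees of freedom
ratio = mean / (max(diffs) - min(diffs))      # delta / w of equation (37)

print(round(ss, 3), round(se, 4), round(t, 2), round(ratio, 3))
```

Both statistics exceed their 1 % points (t against 3.250, the ratio against 0.333), the range ratio only narrowly.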







The greater significance suggested by the t-test seems to be largely due to the exceptional
difference L − D for Patient No. 9, viz. 4.6, which affects s more seriously than w.

Example 2. In the calibration of a viscometer it is necessary to time the interval required
for the level of an aqueous solution of glycerol to fall between two fixed marks. For
satisfactory calibration it is considered desirable that the mean time of flow should be accurate
to ± ½ sec., risking a greater error not more frequently than 1 in 20 times. Five independent
determinations of the time interval (in seconds) for one viscometer were 103.5, 104.1, 102.7,
103.2 and 102.6. While this number of observations is clearly too small for a final assessment
of accuracy, it is often useful to get an interim answer to guide further action.

(a) The sum of squares of the deviations of the five observations about their mean is
1.508, associated with 4 degrees of freedom, giving an estimate of the standard error of the
mean equal to 0.275. Reference to tables shows that a value of t equal to 2.776 lies on the
5 % level of significance. Hence in 19 times out of 20 it would be expected that a sample mean
will not diverge from the true mean value by more than ± 2.776 × 0.275 = ± 0.76 sec. This
error exceeds the assigned limits of ± ½ sec. and therefore points to the necessity of further
tests to fulfil the required conditions.

(b) Instead of computing an estimate of the standard error of the mean from the range
(w = 104.1 − 102.6 = 1.5) in the five determinations, we note from Table 9 that a value of
$\delta/w$ = 0.507 lies on the 5 % level of significance. Hence in 19 times out of 20 the sample
mean will differ from its true value by an amount up to a deviation of

±δ = ± 0.507 × 1.5 = ± 0.76,

a result in agreement with that yielded by the t-test.
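
A minimal Python sketch of the two interval calculations in Example 2 follows. The variable names are illustrative; the multipliers 2.776 (t, 4 degrees of freedom) and 0.507 (Table 9, n = 5) are the 5 % values quoted in the text.

```python
import math

times = [103.5, 104.1, 102.7, 103.2, 102.6]   # seconds

n = len(times)
mean = sum(times) / n
ss = sum((x - mean) ** 2 for x in times)
se = math.sqrt(ss / (n - 1) / n)              # standard error of the mean
half_width_t = 2.776 * se                     # t-based limit on the error of the mean

w = max(times) - min(times)                   # total range of the five values
half_width_range = 0.507 * w                  # range-based limit, from Table 9

print(round(half_width_t, 2), round(half_width_range, 2))
```

Both limits come to about 0.76 sec., which is wider than the ± ½ sec. asked for.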

Example 3. In the processing of raw cotton, modifications were made in the design of one
of the machines with the object of improving the efficiency of cleaning. Tests were made
on a series of 24 different mixings for the purpose of determining whether yarn strength was
adversely affected by the mechanical alterations. The results of the 24 pairs of comparisons
are given below (the strength being expressed as a count × strength product), together with
the differences between them expressed as percentages of the corresponding strengths under
standard conditions.


Yarn strengths under standard and modified conditions


The mean value for the percentage difference in strength is −0.46. Whether this is an
indication that the mechanical modifications have resulted in the production of weaker
yarns is examined by means of the standard deviation and range tests.

(a) The sum of squares of the deviations of the percentage differences about their mean
value is 256.48, based on 23 degrees of freedom. The estimate of the standard deviation of
the percentage differences is 3.34 and the standard error of their mean value is 0.68, giving
a value of t = 0.46/0.68 = 0.69. This is much below the value of 2.069 on the 5 % level of
significance and leads to the conclusion that there are no grounds for suspecting that the
mechanical alterations have led to the production of weaker yarns.

(b) The number of observations places this case outside the range of Table 9, and it is
therefore necessary to use the modified t-function. The data are arranged in random order
of their occurrence, and split into four groups of six. The ranges in the sets of six differences
are 7.5, 9.8, 12.8 and 5.9 with a mean value $\bar{w}(4, 6)$ = 9.0. The estimate of the variate standard
deviation is $\bar{w}(4, 6)/d_6$ = 3.55, giving 0.72 for the standard error of the mean percentage
difference, and u = 0.46/0.72 = 0.64. The 5 % level of significance is, from Table 4, equal to
2.07, much greater than the value computed from the data and therefore indicates the same
conclusion as above.
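
The range calculation of Example 3 (b) can be sketched as below. Only the four subgroup ranges and the mean difference are available from the text, so the sketch starts from those; the variable names are illustrative and $d_6$ = 2.5344 is taken from Table 11.

```python
import math

ranges = [7.5, 9.8, 12.8, 5.9]    # ranges in the four subgroups of six differences
mean_diff = -0.46                 # mean percentage difference in strength
N = 24                            # total number of differences (m = 4, n = 6)
d6 = 2.5344                       # mean range for n = 6, unit normal (Table 11)

w_bar = sum(ranges) / len(ranges)         # mean range
sigma_hat = w_bar / d6                    # range estimate of the standard deviation
se = sigma_hat / math.sqrt(N)             # standard error of the mean difference
u = abs(mean_diff) / se                   # equation (36)

print(round(w_bar, 1), round(sigma_hat, 2), round(se, 2), round(u, 2))
```

Carried at full precision the ratio is about 0.63; the text's 0.64 follows from rounding the standard error to 0.72 first. Either value is far below the 5 % point of 2.07.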

Example 4. Independent determinations of percentage trash content were made in tri- 
plicate on two samples of raw cotton and the following results obtained: 


Percentage trash content of raw cotton

          Sample A    Sample B
            1.13        0.76
            1.31        0.64
            1.25        1.01
Means       1.23        0.80


The point to be decided is whether sample B may be said to be cleaner than sample A, 
or whether the difference between the two average percentage trash contents may be 
accounted for by random experimental variation. Since, in this case, the comparisons are 
not paired, the standard error of the difference between the mean values of the two samples 
has necessarily to be estimated from the variation within each of the two sets of results. 
As before, normal variation in sampling and in testing errors is assumed. 

(a) The sum of squares of deviations of each set of values from their mean is 0.01680 for A
and 0.07167 for B, each associated with 2 degrees of freedom. The best estimate of the error
standard deviation is therefore 0.149, giving 0.122 for the estimate of the standard error
of the difference between the two means. The value of t is equal to (1.23 − 0.80)/0.122 = 3.5
which exceeds the value 2.776 obtained from tables for 4 degrees of freedom and α = 0.05.
On this level of significance the result is taken to indicate a real difference in the cleanliness
of the two cottons.

(b) The difference between the two sample means is 0.43 and the mean of the two ranges
is 0.275. Hence, using the ratio of equation (40), $|\bar{x}_1 - \bar{x}_2| / \tfrac{1}{2}(w' + w'')$ = 1.6 which, from
Table 10, is seen to exceed the value of 1.272 lying on the 5 % level and therefore is taken to
indicate a significant difference in the mean values.
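
A short Python sketch of Example 4, working from the summary figures quoted in the text (means, sums of squares, and the extreme values of each triplicate), is given below. The variable names are illustrative, and the critical values 2.776 and 1.272 are those cited from the tables.

```python
import math

n = 3                                   # determinations per sample
mean_a, mean_b = 1.23, 0.80             # sample means as quoted
ss_a, ss_b = 0.01680, 0.07167           # sums of squares as quoted

# (a) root-mean-square treatment: pooled estimate on 2(n - 1) = 4 degrees of freedom
pooled_sd = math.sqrt((ss_a + ss_b) / (2 * (n - 1)))
se_diff = pooled_sd * math.sqrt(2 / n)
t = (mean_a - mean_b) / se_diff

# (b) range treatment: ratio of equation (40)
w_a = 1.31 - 1.13                       # range of sample A
w_b = 1.01 - 0.64                       # range of sample B
ratio = (mean_a - mean_b) / ((w_a + w_b) / 2)

print(round(pooled_sd, 3), round(se_diff, 3), round(t, 1), round(ratio, 2))
```

Both routes agree: t is about 3.5 against 2.776, and the range ratio is about 1.56 against 1.272.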







Example 5. The following strength test results were obtained on two batches of cotton
yarn (measurements recorded to the nearest ½ lb.) and are noted downwards in order of
random occurrence:


          Sample A                  Sample B
   30.5    31.0    29.5         27.0    28.5
   28.0    31.5    27.5         28.5    25.0
   29.5    30.0    28.0         26.5    28.0
   28.0    27.5    26.0         27.0    27.5
   28.5    29.5    28.5         27.0    27.5
   29.5    27.5    26.5         28.5    28.5
   27.5    28.0    27.0         28.0    28.0
   28.0    32.5    30.0         28.0    26.0
   29.5    28.5    28.5         25.0    26.5
   30.5    29.0    31.0         29.0    28.0


The mean of sample A is 28.90 lb. and that of sample B is 27.40 lb., and the question arises as
to whether sample B is actually weaker than A.

(a) The sums of squares of the deviations about their respective mean values are 68.2
for A and 24.8 for B, associated with 29 and 19 degrees of freedom. The estimate of the error
standard deviation is therefore 1.392, giving 0.402 for the standard error of the difference
between the two means and a value of t equal to (28.9 − 27.4)/0.402 = 3.7. For 48 degrees of
freedom a value of t = 2.68 lies on the 1 % level of significance. The greater value of 3.7
yielded by the data above indicates, therefore, that the difference in strength of the two yarns
may be accepted as statistically significant.

(b) The estimate of the error standard deviation is obtained from the ranges within groups
of ten values, three groups for sample A and two for sample B. The values of these five
ranges are 3.0, 5.0, 5.0, 4.0 and 3.5 with a mean value of 4.1 and a corresponding estimate of
error standard deviation equal to $\bar{w}(5, 10)/d_{10}$ = 1.33. The estimate of the standard error of
the difference between the two means is 0.384 and the value of u is (28.9 − 27.4)/0.384 = 3.9.
For five ranges of ten, the value of u on the 1 % level of significance is, from Table 6, equal to
2.69 (cf. 2.68 for t with 48 degrees of freedom). The value of 3.9 obtained from the data is
greater than this value of 2.69 and this again leads to the conclusion that the difference in
mean strengths of the two yarns is ‘statistically significant’.
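
Example 5 can be retraced end to end with the sketch below. It is an illustration only: the fifty readings are entered as columns of ten read downwards (three for A, two for B), which reproduces the means, sums of squares and ranges quoted in the text; the variable names are hypothetical and $d_{10}$ = 3.0775 is taken from Table 11.

```python
import math

a_groups = [
    [30.5, 28.0, 29.5, 28.0, 28.5, 29.5, 27.5, 28.0, 29.5, 30.5],
    [31.0, 31.5, 30.0, 27.5, 29.5, 27.5, 28.0, 32.5, 28.5, 29.0],
    [29.5, 27.5, 28.0, 26.0, 28.5, 26.5, 27.0, 30.0, 28.5, 31.0],
]
b_groups = [
    [27.0, 28.5, 26.5, 27.0, 27.0, 28.5, 28.0, 28.0, 25.0, 29.0],
    [28.5, 25.0, 28.0, 27.5, 27.5, 28.5, 28.0, 26.0, 26.5, 28.0],
]
a = [x for g in a_groups for x in g]
b = [x for g in b_groups for x in g]
mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
scale = math.sqrt(1 / len(a) + 1 / len(b))        # sqrt(1/N1 + 1/N2)

# (a) root-mean-square treatment, 29 + 19 = 48 degrees of freedom
ss_a = sum((x - mean_a) ** 2 for x in a)
ss_b = sum((x - mean_b) ** 2 for x in b)
pooled_sd = math.sqrt((ss_a + ss_b) / (len(a) + len(b) - 2))
t = (mean_a - mean_b) / (pooled_sd * scale)

# (b) range treatment, equation (38) with m1 + m2 = 5 groups of n = 10
ranges = [max(g) - min(g) for g in a_groups + b_groups]
w_bar = sum(ranges) / len(ranges)
d10 = 3.0775
u = (mean_a - mean_b) * d10 / (w_bar * scale)

print(ranges, round(t, 2), round(u, 2))
```

The two statistics, about 3.7 and 3.9, both lie well beyond their 1 % points of 2.68 and 2.69.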

Note added in proof. Since the present paper went to press, a note by Daly (1946) has
been published, in which it is suggested that the range may be used in place of the
root-mean-square estimate of variance in a test analogous to the t-test. The case where the
estimate of standard deviation is derived from a single range is discussed and values of the ratio
(deviation)/(range) on the 10 % level of significance are given to two significant figures
for a number of low values of n. These agree with the corresponding values given in
Table 9, for α = 0.10, of the present paper. [Mr Lord’s paper was first submitted for
publication in August 1945. Ed.]






APPENDIX 


On the independence of mean and some linear estimates of standard
deviation in random samples from a normal population

In the above practical applications of the u distribution to normal random sampling pro- 
blems, it has been implicitly assumed that range estimates of standard deviation, like root- 
mean-square estimates, are independent of the mean of the sample from which they have 
been determined. The validity of this assumption is established below, where it is shown as 
a particular case of a more general theorem. 

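Consider a set of n random values of a variable x from a normal population of distribution

$$p(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\xi)^2}{2\sigma^2}\right] dx. \qquad (1)$$

Of such a set let $x_p$ and $x_q$ denote the pth and qth values (p < q) in ascending order of
magnitude, and let $x_r$ denote the remaining (n − 2) values such that

$$\begin{aligned}
-\infty < x_r \le x_p, &\quad r = 1, 2, \ldots, (p-1),\\
x_p \le x_r \le x_q, &\quad r = (p+1), (p+2), \ldots, (q-1),\\
x_q \le x_r < \infty, &\quad r = (q+1), (q+2), \ldots, n.
\end{aligned} \qquad (2)$$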

Now a set of any n values may be arranged in n! ways and in random samples all
arrangements of the same n values are of equal probability. The group of (p − 1) values all less than
$x_p$ are not ranked in any particular order, and there are hence (p − 1)! ways in which they
may be arranged. Similarly, the group of values from $x_{p+1}$ to $x_{q-1}$ may be arranged in
(q − p − 1)! ways and the third group from $x_{q+1}$ to $x_n$ in (n − q)! ways. The distribution of
random samples in which the pth and qth values in ascending order are denoted by $x_p$
and $x_q$, and the remaining values satisfy the conditions in (2), is therefore given by

$$p(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \ldots dx_n = \left[\frac{n!}{(p-1)!\,(q-p-1)!\,(n-q)!}\right] \frac{1}{(\sigma\sqrt{2\pi})^n} \exp\left[-\frac{1}{2\sigma^2}\sum_{r=1}^{n}(x_r-\xi)^2\right] dx_1\,dx_2 \ldots dx_n, \qquad (3)$$

where the constant term in brackets makes the total frequency of all such samples equal to
unity.

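The joint distribution of the sample mean $\bar{x}$ of the n values and the difference $\Delta = (x_q - x_p)$
is given by

$$p(\bar{x}, \Delta)\,d\bar{x}\,d\Delta = \int \cdots \int p(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \ldots dx_n, \qquad (4)$$

where the multiple integral is evaluated over the domain of the x’s conditioned by the limits
indicated in (2) and by $\Delta = (x_q - x_p)$ and $\bar{x} = \frac{1}{n}\sum_{r=1}^{n} x_r$.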


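Make the transformation to variables defined by

$$\begin{aligned}
\bar{x} &= \frac{1}{n}(x_1 + x_2 + \ldots + x_n),\\
y_1 &= -x_p + x_1,\\
y_2 &= -x_p + x_2,\\
&\;\;\vdots\\
y_{p-1} &= -x_p + x_{p-1},\\
y_{p+1} &= -x_p + x_{p+1},\\
&\;\;\vdots\\
y_n &= -x_p + x_n.
\end{aligned} \qquad (5)$$

The Jacobian of the transformation is unity and, from (2) and (5), the transformed limits of
integration are

$$\begin{aligned}
-\infty < y_r \le 0, &\quad r = 1, 2, \ldots, (p-1),\\
0 \le y_r \le \Delta, &\quad r = (p+1), (p+2), \ldots, (q-1),\\
\Delta \le y_r < \infty, &\quad r = (q+1), (q+2), \ldots, n,\\
y_q = \Delta. &
\end{aligned} \qquad (6)$$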

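Using the relations in (5) above it may easily be shown that

$$\sum_{r=1}^{n}(x_r-\xi)^2 = n(\bar{x}-\xi)^2 + \frac{n-1}{n}\sum_{r=1}^{n} y_r^2 - \frac{2}{n}\sum_{s=1}^{n-1}\sum_{t=1}^{n-s} y_s\,y_{s+t}, \qquad (7)$$

where in the summation on the right r, s, s + t ≠ p. With the new variables we have, from
(3), (5), (6) and (7), the joint distribution of $\bar{x}$ and Δ given by

$$p(\bar{x},\Delta)\,d\bar{x}\,d\Delta = \left\{\frac{\sqrt{n}}{\sigma\sqrt{2\pi}}\exp\left[-\frac{n(\bar{x}-\xi)^2}{2\sigma^2}\right]d\bar{x}\right\} \left\{\frac{\sqrt{n}\,(n-1)!\;d\Delta}{(p-1)!\,(q-p-1)!\,(n-q)!\,(2\pi)^{(n-1)/2}\sigma^{n-1}} \int\cdots\int \exp\left[-\frac{1}{2\sigma^2}\left(\frac{n-1}{n}\sum_{r=1}^{n} y_r^2 - \frac{2}{n}\sum_{s=1}^{n-1}\sum_{t=1}^{n-s} y_s\,y_{s+t}\right)\right]\prod_{r\neq p,q} dy_r\right\}, \qquad (8)$$

with the restriction that r, s, s + t ≠ p, the integration extending over the limits (6), and Δ
is to be substituted for $y_q$.

The first bracket of (8) is the distribution of the sample mean $\bar{x}$. It follows that the second
bracket is the distribution of Δ and, because this expression does not involve $\bar{x}$, that in
random samples from a normal population the difference between the pth and qth values in
ascending order of magnitude is independent of the sample mean. Linear estimates of the
population standard deviation built from such differences (and similar measures of dispersion)
are therefore independent of the corresponding sample mean; this includes the range, which is
the difference between the least and greatest values. Furthermore, if the sample is divided into
subgroups, a simple extension of the argument shows that there is also statistical independence
between the sample mean and the corresponding mean range of the subgroups.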
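
The independence result lends itself to a rough numerical illustration. The following Python sketch is not part of the proof and rests on assumptions of its own: it estimates only the correlation between sample mean and sample range (a zero correlation is necessary for independence, not sufficient), uses an arbitrary seed and sample size, and contrasts the normal case with an exponential population, for which the two statistics are plainly related. All names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_range_correlation(draw, n=5, trials=20000):
    """Correlation between mean and range over repeated samples of n values."""
    means, ranges = [], []
    for _ in range(trials):
        sample = [draw() for _ in range(n)]
        means.append(sum(sample) / n)
        ranges.append(max(sample) - min(sample))
    return pearson(means, ranges)


rng = random.Random(1947)
r_normal = mean_range_correlation(lambda: rng.gauss(0.0, 1.0))
r_skewed = mean_range_correlation(lambda: rng.expovariate(1.0))
print(round(r_normal, 3), round(r_skewed, 3))
```

For normal samples the estimated correlation should be indistinguishable from zero, while for the exponential population it is strongly positive, which is a reminder that the u-test leans on the normality assumption.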




Table 4. 5 % points of u = u(m, n)
























Table 5. 2 % points of u = u(m, n)



Table 6. 1 % points of u = u(m, n)


























Table 7. 0.2 % points of u = u(m, n)

 n \ m      1       2       3       4       5       6       8      10      15      20      30
   2       ..     20.1     ..      6.8     5.7     5.1     4.5     4.1     3.7     ..      3.4(2)
   3      19.8     6.8     ..      4.6     4.2     ..      3.7     3.6     3.4     ..      3.2
   4       9.4     5.1     ..      3.9     3.7     3.6     3.5−    3.4     3.3     3.2     3.2(1)
   5       6.8     4.4     3.9     3.7     3.5+    3.5−    3.4     3.3     3.2     3.2     3.2(1)
   6       5.7     4.1     3.7     3.5+    3.4     3.4     3.3     3.3     3.2     3.2     3.1
   7       ..      3.9     3.6     3.5−    3.4     3.3     3.3     3.2     3.2     3.2     3.1
   8       4.7     3.8     3.5+    3.4     3.3     3.3     3.2     3.2     3.2     3.2     3.1
   9       4.6     3.7     3.5−    3.4     3.3     3.3     3.2     3.2     3.2     3.1     3.1
  10       4.3     3.6     3.4     3.3     3.3     3.3     3.2     3.2     3.2     3.1     3.1
  12       4.1     3.5+    3.4     3.3     3.3     3.2     3.2     3.2     3.1     3.1     3.1
  16       3.9     3.5−    3.3     3.3     3.2     3.2     3.2     3.2     3.1     3.1     3.1
  20       3.7     3.4     3.3     3.2     3.2     3.2     3.2     3.1     3.1     3.1     3.1

Table 8. 0.1 % points of u = u(m, n)

 n \ m      1       2       3       4       5       6       8      10      15      20      30
   2     507.9    28.5−   11.7     8.1     6.7     ..      ..      4.6     4.1     3.9     3.7(2)
   3      28.1     8.7     ..      5.1     4.6     ..      ..      3.9     3.7     3.6     3.5(1)
   4      11.8     5.8     4.7     4.3     4.1     3.9     3.7     3.6     3.5+    3.5−    3.4(1)
   5       8.2     ..      4.3     ..      3.8     3.7     3.6     3.6     3.5−    3.4     3.4(1)
   6       6.7     4.5+    ..      3.8     3.7     3.6     3.5+    3.5−    3.4     3.4     3.4(1)
   7       5.9     4.3     3.9     3.7     3.6     ..      3.5+    3.5−    3.4     3.4     3.3
   8       5.4     4.1     3.8     3.7     3.6     3.5+    3.5−    3.4     3.4     3.4     3.3
   9       5.0     ..      3.8     3.6     3.6     3.5+    3.5−    3.4     3.4     3.4     3.3
  10       4.8     ..      3.7     3.6     3.5+    3.5     3.4     3.4     3.4     3.4     3.3
  12       4.5+    3.8     3.6     ..      3.5     3.5−    3.4     3.4     3.4     3.3     3.3
  16       4.2     3.7     3.6     3.5+    3.5−    3.4     3.4     3.4     3.3     3.3     3.3
  20       4.0     3.6     3.5+    3.5−    3.4     3.4     3.4     3.4     3.3     3.3     3.3


Note: The numbers in brackets in the column headed m = 30 indicate the number of units which must
be subtracted in the first decimal place to obtain the level for m = 60 and the same value of n. Where
no figure is given u(60, n) = u(30, n) to first decimal place accuracy; u(120, n) = u(60, n) for (i) all 0.2 %
points except that u(120, 3) = 3.1 and (ii) all 0.1 % points except that u(120, 2) = 3.4 and u(120, 3) = 3.3.



































The table gives values of the ratio $\delta/w = |\bar{x} - \xi|$/(range in sample) lying on different levels of
significance, the levels being the sum, α, of the two tails of the probability distribution.


Table 10. Table for testing the significance of the difference between
the means of two small samples of equal size n



  n      α = 0.10    α = 0.05
  2       2.322       3.427
  3       0.974       1.272
  4       0.644       0.813
  5       0.493       0.613
  6       0.405+      0.499
  7       0.347       0.426
  8       0.306       0.373
  9       0.275−      0.334
 10       0.250       0.304
 11       0.233       0.280
 12       0.214       0.260
 13       0.201       0.243
 14       0.189       0.228
 15       0.179       0.216
 16       0.170       0.205−
 17       0.162       0.195+
 18       0.155+      0.187
 19       0.149       0.179
 20       0.143       0.172



The table gives values of the ratio $|\bar{x}_1 - \bar{x}_2| / \tfrac{1}{2}(w' + w'')$ lying on different levels of
significance. The levels are the sum, α, of the two tails of the probability distribution.
When considering deviations in the positive (or negative) direction only, the values of α at the
headings of the columns should be halved.














Table 11 


  n      d_n      1/d_n      √n       d_n√n
  2    1.1284    0.8862    1.4142    1.5958
  3    1.6926    0.5908    1.7321    2.9316
  4    2.0588    0.4857    2.0000    4.1176
  5    2.3259    0.4299    2.2361    5.2009
  6    2.5344    0.3946    2.4495    6.2080
  7    2.7044    0.3698    2.6458    7.1551
  8    2.8472    0.3512    2.8284    8.0531
  9    2.9700    0.3367    3.0000    8.9101
 10    3.0775    0.3249    3.1623    9.7319
 11    3.1729    0.3152    3.3166   10.5232
 12    3.2585    0.3069    3.4641   11.2876
 13    3.3360    0.2998    3.6056   12.0281
 14    3.4068    0.2935    3.7417   12.7469
 15    3.4718    0.2880    3.8730   13.4463
 16    3.5320    0.2831    4.0000   14.1279
 17    3.5879    0.2787    4.1231   14.7932
 18    3.6401    0.2747    4.2426   15.4436
 19    3.6890    0.2711    4.3589   16.0798
 20    3.7350    0.2677    4.4721   16.7032
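
The entries of Table 11 can be checked numerically. The sketch below is an independent check rather than Tippett's method: it assumes the standard integral expression for the mean range in samples of n from a unit normal population, $d_n = \int_{-\infty}^{\infty}\{1 - \Phi(x)^n - [1-\Phi(x)]^n\}\,dx$, which is not stated in the text, and evaluates it by the trapezoidal rule. The function names and integration limits are illustrative.

```python
import math


def phi_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_range(n, lo=-8.0, hi=8.0, steps=16000):
    """d_n by trapezoidal integration of 1 - Phi^n - (1 - Phi)^n over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        p = phi_cdf(lo + i * h)
        f = 1.0 - p ** n - (1.0 - p) ** n
        total += f if 0 < i < steps else 0.5 * f
    return total * h


for n in (2, 5, 10):
    d = mean_range(n)
    print(n, round(d, 4), round(1 / d, 4), round(d * math.sqrt(n), 4))
```

The remaining columns of the table follow directly from $d_n$, so one function is enough to regenerate any row.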


REFERENCES

Daly, J. F. (1946). Ann. Math. Statist. 17, 71.
Davies, O. L. & Pearson, E. S. (1934). J. Roy. Statist. Soc. Suppl. 1, 76.
Fisher, R. A. (1926). Metron, 5, 90.
Hartley, H. O. (1942). Biometrika, 32, 334.
McKay, A. T. & Pearson, E. S. (1933). Biometrika, 25, 415.
Pearson, E. S. (1926). Biometrika, 18, 173.
Pearson, E. S. (1932). Biometrika, 24, 404.
Pearson, E. S. (1935). The Application of Statistical Methods to Industrial
Standardization and Quality Control. B.S. No. 600. London: British
Standards Institution.
Pearson, E. S. & Haines, Joan (1935). J. Roy. Statist. Soc. Suppl. 2, 83.
Pearson, E. S. & Hartley, H. O. (1942). Biometrika, 32, 301.
‘Student’ (1908). Biometrika, 6, 1.
Tippett, L. H. C. (1925). Biometrika, 17, 364.



THE FREQUENCY DISTRIBUTION OF $\sqrt{b_1}$ FOR SAMPLES OF ALL
SIZES DRAWN AT RANDOM FROM A NORMAL POPULATION

By R. C. GEARY 


1. Introductory

A research on which the writer has been engaged for some years has so far yielded the 
following results : 

(1) Testing for normality has a greater practical importance than statisticians (including
the writer) have been disposed to accord to it; actual probabilities may be seriously at
variance with probabilities derived from the well-known tables computed on the hypothesis
of universal normality; in consequence, testing for normality and, where necessary,
correction (even if rough and tentative) for suspected universal non-normality, should
become a part of statistical routine.

(2) For large samples, $\sqrt{b_1}$ and $b_2$ are the most efficient of large fields of tests of skewness
and kurtosis, respectively, amongst large fields of alternative universes.

These matters will be dealt with in detail in subsequent papers. It seems, in the first
instance, desirable to derive the frequency distribution of $\sqrt{b_1}$ for normal random samples
of all sizes, partly on account of the inherent importance of the problem, partly in order to
explore a computational technique which might be found effective in solving the analogous
but probably more difficult $b_2$ problem.

Towards the solution of the problem there are available the exact values of the first four even
moments (the odd moments are, of course, zero) of normal $\sqrt{b_1}$, the second, fourth and
sixth having been determined by R. A. Fisher (1930) and the eighth by Joseph Pepper
(1932). It may be useful here to set out the four moments. Taking

$$\sqrt{b_1} = m_3/m_2^{3/2} = \frac{\frac{1}{n}\sum (x_i - \bar{x})^3}{\left\{\frac{1}{n}\sum (x_i - \bar{x})^2\right\}^{3/2}}, \qquad (1.1)$$

where n is the sample number, we have

$$\begin{aligned}
\mu_2 &= \frac{6(n-2)}{(n+1)(n+3)},\\
\mu_4 &= \frac{108(n-2)(n^2+27n-70)}{(n+1)(n+3)(n+5)(n+7)(n+9)},\\
\mu_6 &= \frac{3240(n-2)(n^4+84n^3+2695n^2-15168n+20020)}{(n+1)(n+3)(n+5)(n+7)(n+9)(n+11)(n+13)(n+15)},\\
\mu_8 &= \frac{7\cdot 5\cdot 3^5\cdot 2^4\,(n-2)(n^6+171n^5+13893n^4+580401n^3-5131014n^2+14132268n-12932920)}{(n+1)(n+3)(n+5)\cdots(n+19)(n+21)}.
\end{aligned} \qquad (1.2)$$
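
The moment formulae can be given a partial check in exact arithmetic. The Python sketch below is offered under stated assumptions: the polynomial coefficients are those read from the garbled print and should be treated as a reconstruction, and the check uses the fact that for n = 3 the statistic reduces to $\cos 3\theta/\sqrt{2}$ with θ rectangular, so that its 2k-th moment is $\binom{2k}{k}/8^k$. Agreement at n = 3 and in the large-sample limit supports the coefficients but does not prove them for every n. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def odd_product(n, last):
    """(n + 1)(n + 3)...(n + last), last odd."""
    p = 1
    for k in range(1, last + 1, 2):
        p *= n + k
    return p


def mu2(n):
    return Fraction(6 * (n - 2), odd_product(n, 3))


def mu4(n):
    return Fraction(108 * (n - 2) * (n**2 + 27 * n - 70), odd_product(n, 9))


def mu6(n):
    poly = n**4 + 84 * n**3 + 2695 * n**2 - 15168 * n + 20020
    return Fraction(3240 * (n - 2) * poly, odd_product(n, 15))


def mu8(n):
    poly = (n**6 + 171 * n**5 + 13893 * n**4 + 580401 * n**3
            - 5131014 * n**2 + 14132268 * n - 12932920)
    return Fraction(7 * 5 * 3**5 * 2**4 * (n - 2) * poly, odd_product(n, 21))


def arcsine_moment(k):
    """E[t^(2k)] for t = cos(3*theta)/sqrt(2), theta rectangular (the n = 3 case)."""
    return Fraction(comb(2 * k, k), 8**k)


for k, mu in enumerate((mu2, mu4, mu6, mu8), start=1):
    print(2 * k, mu(3), arcsine_moment(k))
```

For large n the standardized moments $\mu_4/\mu_2^2$, $\mu_6/\mu_2^3$ and $\mu_8/\mu_2^4$ should approach the normal values 3, 15 and 105, which gives a second consistency check.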



E. S. Pearson (1931, 1936) derived empirically 0.05 and 0.01 probability points for certain
values of n ≥ 25 using a Pearson Type VII curve and earlier approximations by R. A. Fisher
(1929) of the second and fourth moments.

The method here used for the derivation of the frequency distribution of $\sqrt{b_1}$ is essentially an
elaboration of that which the author used (1935, 1936) for finding the frequency distribution of
the test of kurtosis a (the ratio of the mean deviation to the standard deviation of the numbers
sampled), which consisted in establishing a relation in integral form between the frequency
ordinate for n with the value for (n − 1) and thereby determining the ordinates to any
required degree of accuracy for the lower n’s. At a certain stage the actual frequency is shown
to be very close to the value based on the Gram-Charlier curve for the same value of n; and
the assumption is made that the Gram-Charlier may be relied on for values of n greater
than the ‘transition value’. In the present problem the known normal moments are utilized
as well at every stage. In the concluding section the status of the solution in the hierarchy
of ‘precision’ is discussed.

Since the frequency is symmetrical, attention is confined practically exclusively to the 
positive sector. 


2. The general integral iteration 

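To distinguish the sample size by the notation let the value of $\sqrt{b_1}$ be indicated by $t_n$. Apply
a Helmert orthogonal transformation to the original observations $x_1, x_2, \ldots, x_n$ so that

$$\begin{aligned}
x'_1 &= (x_1 - x_2)/\sqrt{2},\\
x'_2 &= (x_1 + x_2 - 2x_3)/\sqrt{6},\\
&\;\;\vdots\\
x'_{n-1} &= [x_1 + x_2 + \ldots + x_{n-1} - (n-1)x_n]/\sqrt{[n(n-1)]},\\
x'_n &= (x_1 + x_2 + \ldots + x_n)/\sqrt{n} = \bar{x}\sqrt{n},
\end{aligned} \qquad (2.1)$$

which, on inversion, gives

$$\begin{aligned}
x_1 - \bar{x} &= \frac{x'_1}{\sqrt{2}} + \frac{x'_2}{\sqrt{6}} + \ldots + \frac{x'_{n-1}}{\sqrt{[n(n-1)]}},\\
x_2 - \bar{x} &= -\frac{x'_1}{\sqrt{2}} + \frac{x'_2}{\sqrt{6}} + \ldots + \frac{x'_{n-1}}{\sqrt{[n(n-1)]}},\\
x_3 - \bar{x} &= -\frac{2x'_2}{\sqrt{6}} + \frac{x'_3}{\sqrt{12}} + \ldots + \frac{x'_{n-1}}{\sqrt{[n(n-1)]}},\\
&\;\;\vdots\\
x_{n-1} - \bar{x} &= -\frac{(n-2)x'_{n-2}}{\sqrt{[(n-1)(n-2)]}} + \frac{x'_{n-1}}{\sqrt{[n(n-1)]}},\\
x_n - \bar{x} &= -\frac{(n-1)x'_{n-1}}{\sqrt{[n(n-1)]}}.
\end{aligned} \qquad (2.2)$$

Then

$$\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n-1} x_i'^2 \qquad (2.3)$$

and

$$\begin{aligned}
\sum_{i=1}^{n}(x_i - \bar{x})^3 ={}& 3x_1'^2\left(\frac{x'_2}{\sqrt{6}} + \frac{x'_3}{\sqrt{12}} + \ldots + \frac{x'_{n-1}}{\sqrt{[(n-1)n]}}\right)
+ 3x_2'^2\left(\frac{x'_3}{\sqrt{12}} + \frac{x'_4}{\sqrt{20}} + \ldots + \frac{x'_{n-1}}{\sqrt{[(n-1)n]}}\right)\\
&+ \ldots + \frac{3x_{n-2}'^2\,x'_{n-1}}{\sqrt{[(n-1)n]}}
- \left(\frac{x_2'^3}{\sqrt{6}} + \frac{2x_3'^3}{\sqrt{12}} + \frac{3x_4'^3}{\sqrt{20}} + \ldots + \frac{(n-2)x_{n-1}'^3}{\sqrt{[(n-1)n]}}\right).
\end{aligned} \qquad (2.4)$$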
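
The Helmert relations can be verified numerically on any set of values. The sketch below is illustrative only: the function names are hypothetical, and the cubic sum is accumulated in the compact form $\sum_k [3x'_k(x_1'^2 + \ldots + x_{k-1}'^2) - (k-1)x_k'^3]/\sqrt{[k(k+1)]}$, which is an algebraic regrouping of the expanded expression for the sum of cubes rather than a formula printed in the paper.

```python
import math


def helmert(x):
    """x'_k = (x_1 + ... + x_k - k * x_{k+1}) / sqrt(k(k+1)) for k = 1, ..., n-1."""
    n = len(x)
    return [(sum(x[:k]) - k * x[k]) / math.sqrt(k * (k + 1)) for k in range(1, n)]


def cubic_sum_from_helmert(xp):
    """Sum of cubed deviations from the mean, rebuilt from the Helmert variables."""
    total, ssq = 0.0, 0.0
    for k, v in enumerate(xp, start=1):        # xp[k-1] holds x'_k
        total += (3 * v * ssq - (k - 1) * v**3) / math.sqrt(k * (k + 1))
        ssq += v * v                           # running x'_1^2 + ... + x'_k^2
    return total


x = [0.3, -1.2, 2.5, 0.7, -0.4, 1.9, -2.2]     # arbitrary illustrative values
m = sum(x) / len(x)
xp = helmert(x)
print(sum((v - m) ** 2 for v in x), sum(v * v for v in xp))
print(sum((v - m) ** 3 for v in x), cubic_sum_from_helmert(xp))
```

Each printed pair should agree to rounding error, the first illustrating the sum of squares identity and the second the sum of cubes.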

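Apply a polar transformation to the $x'_i$, that is,

$$\begin{aligned}
x'_1 &= r \sin\theta_{n-3} \sin\theta_{n-4} \ldots \sin\theta_1 \sin\theta_0,\\
x'_2 &= r \sin\theta_{n-3} \sin\theta_{n-4} \ldots \sin\theta_1 \cos\theta_0,\\
x'_3 &= r \sin\theta_{n-3} \sin\theta_{n-4} \ldots \sin\theta_2 \cos\theta_1,\\
x'_4 &= r \sin\theta_{n-3} \sin\theta_{n-4} \ldots \sin\theta_3 \cos\theta_2,\\
&\;\;\vdots\\
x'_{n-3} &= r \sin\theta_{n-3} \sin\theta_{n-4} \cos\theta_{n-5},\\
x'_{n-2} &= r \sin\theta_{n-3} \cos\theta_{n-4},\\
x'_{n-1} &= r \cos\theta_{n-3},
\end{aligned} \qquad (2.5)$$

and

$$\sum_{i=1}^{n-1} x_i'^2 = r^2, \qquad (2.6)$$

so that

$$\begin{aligned}
t_n = \sqrt{n}\,\Big\{ &\frac{3}{\sqrt{6}} \sin^3\theta_{n-3} \ldots \sin^3\theta_1 \sin^2\theta_0 \cos\theta_0
+ \frac{3}{\sqrt{12}} \sin^3\theta_{n-3} \ldots \sin^3\theta_2 \sin^2\theta_1 \cos\theta_1 + \ldots\\
&+ \frac{3}{\sqrt{[(n-2)(n-1)]}} \sin^3\theta_{n-3} \sin^2\theta_{n-4} \cos\theta_{n-4}
+ \frac{3}{\sqrt{[(n-1)n]}} \sin^2\theta_{n-3} \cos\theta_{n-3}\\
&- \frac{1}{\sqrt{6}} \sin^3\theta_{n-3} \ldots \sin^3\theta_1 \cos^3\theta_0
- \frac{2}{\sqrt{12}} \sin^3\theta_{n-3} \ldots \sin^3\theta_2 \cos^3\theta_1 - \ldots\\
&- \frac{n-3}{\sqrt{[(n-2)(n-1)]}} \sin^3\theta_{n-3} \cos^3\theta_{n-4}
- \frac{n-2}{\sqrt{[(n-1)n]}} \cos^3\theta_{n-3} \Big\},
\end{aligned} \qquad (2.7)$$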

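whence the fundamental iteration

$$t_n = \left(\frac{n}{n-1}\right)^{1/2} t_{n-1} \sin^3\theta_{n-3} + \frac{1}{(n-1)^{1/2}}\left[3\sin^2\theta_{n-3}\cos\theta_{n-3} - (n-2)\cos^3\theta_{n-3}\right], \qquad (2.8)$$

in which the only angle to appear explicitly is $\theta_{n-3}$, and for normal random samples it is a
well-known fact that the θ's are distributed independently of one another, the distribution of
$\theta_{n-3}$ being

$$C \sin^{n-3}\theta_{n-3}\,d\theta_{n-3}. \qquad (2.9)$$

Now $t_{n-1}$ involves only $\theta_0, \ldots, \theta_{n-4}$; hence it is independent of $\theta_{n-3}$. Accordingly, if the
frequency distribution of $t_{n-1}$ is of the form

$$f_{n-1}(t_{n-1})\,dt_{n-1}, \qquad (2.10)$$

the joint distribution of $\theta_{n-3}$ and $t_{n-1}$ is given by

$$C \sin^{n-3}\theta_{n-3}\, f_{n-1}(t_{n-1})\,d\theta_{n-3}\,dt_{n-1}. \qquad (2.11)$$

Now, from (2.8), for a fixed value of $\theta_{n-3}$,

$$dt_n = \left(\frac{n}{n-1}\right)^{1/2} \sin^3\theta_{n-3}\,dt_{n-1}. \qquad (2.12)$$

On substituting in (2.11) and integrating we find for the frequency of $t_n$ the expression

$$f_n(t_n) = C\left(\frac{n-1}{n}\right)^{1/2} \int \sin^{n-6}\theta_{n-3}\, f_{n-1}(t_{n-1})\,d\theta_{n-3}, \qquad (2.13)$$

where the relation (2.8) obtains. Integration extends to values of $\theta_{n-3}$ (so that $0 \le \theta_{n-3} \le \pi$
for n > 3) which yield non-zero values of $f_{n-1}$. Setting $\cos\theta_{n-3} = x$ the integral at (2.13)
assumes the form

$$f_n(t_n) = C\left(\frac{n-1}{n}\right)^{1/2} \int (1-x^2)^{(n-7)/2}\, f_{n-1}(t_{n-1})\,dx, \qquad (2.14)$$

with, from (2.8),

$$n^{1/2}\,t_{n-1} = \left[(n-1)^{1/2}\,t_n - 3x + (n+1)x^3\right]/(1-x^2)^{3/2}. \qquad (2.15)$$

In the derivation of the frequencies for n = 4 to 8 inclusive, dealt with in later sections, both
the forms (2.13) and (2.14) are used.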


3. Functional discontinuities of the frequency

In the integral at (2.14) $t_n$ appears merely as a parameter. Consequently the nature of the
frequency $f_n(t_n)$ depends to a considerable extent on the simple algebraic properties of
$t_{n-1}(x)$ given by (2.15). The following property (easily demonstrated) is fundamental:

For $t_n = (n-2k)/[k(n-k)]^{1/2} = {}_kT_n$  (k = 1, 2, ...),  (3.1)

$t_{n-1}$ has a maximum value of

$(n-2k+1)/[(k-1)(n-k)]^{1/2} = {}_{k-1}T_{n-1}$  (3.2)

for $x = -[(n-k)/k(n-1)]^{1/2}$  (3.3)

and a minimum value of

$(n-2k-1)/[k(n-k-1)]^{1/2} = {}_kT_{n-1}$  (3.4)

for $x = [k/(n-1)(n-k)]^{1/2}$.  (3.5)

Definition. ${}_kT_n$ are termed the link values or links of $t_n$. The regions between consecutive links are termed zones. The graph of $t_{n-1}(x)$ for $-1 \le x \le +1$ and $t_n = {}_kT_n$ (given at (3-1)) is illustrated in Fig. 1. The limits of integration for integral (2-14) are now seen to be the values of $x$ at which the ordinates of the curve (2-15) in $(x, t_{n-1})$, with parameter $t_n$, assume the limiting values of $t_{n-1}$. The scale on the right shows the links of $t_{n-1}$. The curve traverses all the zones but has a 'turn' in the $(k-1)$th zone, remaining entirely in the zone the while. It is due to this turn that the phenomenon of functional discontinuity manifests itself in the frequency $f_n(t_n)$.
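The fundamental property at (3-1) to (3-5) can be checked numerically. The sketch below is only illustrative: it assumes that (2-15) reads $n^{1/2} t_{n-1} = [(n-1)^{1/2} t_n - 3x + (n+1)x^3]/(1-x^2)^{3/2}$, and the function names `link`, `t_prev` and `turning_points` are hypothetical helpers introduced here, not the author's.

```python
import math

def link(n, k):
    """Link value kT_n = (n - 2k) / sqrt(k (n - k)), as in (3-1)."""
    return (n - 2 * k) / math.sqrt(k * (n - k))

def t_prev(n, t_n, x):
    """t_{n-1} for given t_n and x, using the form assumed here for (2-15)."""
    numerator = math.sqrt(n - 1) * t_n - 3 * x + (n + 1) * x ** 3
    return numerator / (math.sqrt(n) * (1 - x * x) ** 1.5)

def turning_points(n, k):
    """Abscissae of the maximum (3-3) and minimum (3-5); needs 2 <= k < n - 1."""
    x_max = -math.sqrt((n - k) / (k * (n - 1)))
    x_min = math.sqrt(k / ((n - 1) * (n - k)))
    return x_max, x_min

if __name__ == "__main__":
    for n in range(5, 9):
        for k in range(2, n // 2 + 1):
            x_max, x_min = turning_points(n, k)
            t_n = link(n, k)
            # Both differences should vanish if (3-2) and (3-4) hold.
            print(n, k,
                  round(t_prev(n, t_n, x_max) - link(n - 1, k - 1), 12),
                  round(t_prev(n, t_n, x_min) - link(n - 1, k), 12))
```

For each link of $t_n$ the maximum of the curve lands on the $(k-1)$th link of $t_{n-1}$ and the minimum on the $k$th, which is why the curve turns inside a single zone.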

Assume that within the $k$th zone the frequency $f_{n-1}(t_{n-1})$ is represented by ${}_kf_{n-1}(t_{n-1})$, different in functional form for different values of $k$ but the same (for example, having the same coefficients in a power series) within each zone. It will at once be evident, from (2-14), that the frequency of $t_n$ will have a like property. Now, from (2-4) and (2-6) it will be seen that the distribution of the angle is rectangular, so that the distribution* of $t_3$ is given by

$f_3(t_3) = \dfrac{1}{\pi}\,(\tfrac12 - t_3^2)^{-1/2}$ (3-7)

and zero for $|t_3| > 1/\sqrt2$. It follows that $f_3(t_3)$ has a functional discontinuity at its links $\pm 1/\sqrt2$. Hence, by iteration, the frequency of $t_n$ is represented by different functional expressions in its different interlink zones.


Fig. 1. Graph of $t_{n-1}(x)$.


That the frequency has a finite limit (when $n$ is finite) is established as follows. It can easily be seen from (2-15) that when $t_n = {}_1T_n$ the curve $t_{n-1} = t_{n-1}(x)$ degenerates into (i) the straight line $x = -1$ and (ii) a section above the straight line $t_{n-1} = {}_1T_{n-1}$ but touching it.

For $t_n > {}_1T_n$ no part of the curve $t_{n-1} = t_{n-1}(x)$ falls within the rectangle $x = \pm 1$, $t_{n-1} = \pm\,{}_1T_{n-1}$.

Reference to (2-14) shows at once that, if $f_{n-1}(t_{n-1}) = 0$ for $|t_{n-1}| > {}_1T_{n-1}$, then $f_n(t_n) = 0$ for $|t_n| > {}_1T_n$. Now (3-7) shows that $t_3$ has as limiting values $\pm 1/\sqrt2$. Hence, by iteration, it follows that the limiting values of the frequency of $t_n$ (or simpliciter of $t_n$) are

$\pm\,{}_1T_n = \pm(n-2)/(n-1)^{1/2}$. (3-8)

* R. A. Fisher (1930).





As will presently appear, the frequencies for $n = 4$ and 5 have marked irregularities: successive integration in accordance with (2-14) imparts, of course, a progressively increasing degree of smoothness to the frequency. To give mathematical expression to this feature, recourse is had to the idea of order of contact.

Definition. Two functions are said to have contact of order ${}_kY_n$ at link ${}_kT_n$ if the functions and their first $({}_kY_n - 1)$ derivatives are finite and equal at the link. It can be shown without difficulty that

${}_kY_n = {}_{k-1}Y_{n-1} + 1$

when $k > 1$, $n > 4$. For what follows it will be convenient to set out for the smaller sample numbers the values of the links and their orders of contact. The links for positive values only of the variables are shown. The orders of contact will appear from a proposition proved in § 5, giving the actual values of the frequencies near the limit of range. The non-diminishing smoothness in the direction of the centre of the range will be noted.


Values of ${}_kT_n$ and ${}_kY_n$, $n = 3$ to 8 inclusive

 n    1st link         2nd link         3rd link          4th link
      1T_n     1Y_n    2T_n     2Y_n    3T_n      3Y_n    4T_n   4Y_n
 3    1/√2             —        —       —         —       —      —
 4    2/√3     0       0                —         —       —      —
 5    3/2      1       1/√6     1       —         —       —      —
 6    4/√5     1       1/√2     2       0         2       —      —
 7    5/√6     2       3/√10    2       1/(2√3)   3       —      —
 8    6/√7     2       2/√3     3       2/√15     3       0      4



For even values of n the origin is always a link. In the determination of the frequencies 
for $n = 5$ to 8, by the methods described in subsequent sections, the link ordinates and the
central ordinate play a cardinal role. In fact, the method will be seen to consist essentially 
in finding curves which pass through the central and link ordinal points, have the required 
orders of contact and the required form at the limit of range and have the exact earlier 
momental values (see first section). 

4. The frequency near the centre of range 
It will first be shown that

$f_n'(+0) = 0$ for $n > 4$. (4-1)

In fact, from (2-14) and (2-15), if $t_n = u$, a small positive quantity,

C+A~KU-k“ 

fju) = 0 dx{l-x^)i(’^-'^f„_Tlt,,_i(x)l 

J — A-k«=A.' 

A, /c being positive constants. Hence 

f:,{u) = - ^^{{l - - (1 - 



















Letting $u \to 0$ the integral-free expression obviously vanishes provided that $f_{n-1}[t_{n-1}(\lambda)]$ is finite, which it is when $n > 4$; and the integral becomes





Since $f_{n-1}(y)$ is an even function of $y$, its derivative is odd, and it remains an odd function when $y$ is replaced by an odd function of $x$. Hence the integral vanishes.


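5. The form of the frequency at the limit of range
In this section it will be shown that near $t_n = \pm(n-2)/(n-1)^{1/2}$ the frequency is given by

$f_n(t_n) \simeq \dfrac{(n-1)^{1/2}\,[\tfrac12(n-3)]!}{3\sqrt{\pi}\,[\tfrac12(n-4)]!} \left\{\dfrac{2(n-1)^{1/2}}{3n}\right\}^{\frac12(n-4)} \left\{\dfrac{n-2}{(n-1)^{1/2}} - |t_n|\right\}^{\frac12(n-4)}$. (5-1)

It may be seen at once that for $n = 3$ the frequency by (5-1) would be

$f_3(t_3) \simeq \dfrac{1}{\pi}\,(\tfrac12 - t_3^2)^{-1/2}$, (5-2)

as at (3-7). For $n = 4$, (5-1) gives $1/(2\sqrt3)$, which is the value found by A. T. McKay (1933).
The general theorem will be proved by iteration. We assume a general form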

\ l(n-E) 

fn-lik-l) — 

and show that a similar form emerges for/„(t„), .finding incidentally an iteration relation for 
the constant 0^. First set 

V = n — 2 — n — l^k> 

and assume that d is a positive quantity. It will readily appear, from (2-16), that, for t; = 0, 
^ double root at a; = l/(?i — 1). Accordingly we set 



x = x' + ll{n — l) 

(6-4) 

and 


(6-5) 

Having regard only to principal terms we find 



1 -2) 

(71-1)^’ 

(6-6) 


X 2n-^ln - 1 )3 (» - 2)-2 («, - 3 ) ( 1 - x"^) , 

(6-7) 

with 

from (2'14), 

(5-8) 

Now, 






and, from the analysis in § 3, it will be clear that there are two separate parts of the domain $D$:

(I) a part near $x = 1/(n-1)$ for which $t_{n-1}$ is entirely in the first zone and by hypothesis has the form (5-3);

(II) a part near $x = -1$ in which $t_{n-1}$ assumes all values.

The symbol $\simeq$ signifies 'equals, to required approximation'.



Let

${}_{\mathrm{I}}f_n(t_n) + {}_{\mathrm{II}}f_n(t_n) = f_n(t_n)$, (5-9)

where the functions on the left represent the contributions accruing from the respective parts of the domain of integration. Then

X {n — 2)-^ [n— B) {1 —x"^) 

which, on a change of variable from x to x" by (6-4) and (5-8) and integrating in x", becomes 

(^_ l)«-3 (^_ 2 


:2 


(5-10) 


which is of the required form. As regards the contribution of II, in {2-16) set 

«+ 1 = + 


( 6 - 11 ) 


It will be found that when has its limiting value + (w — 3)/(n — 2)^ the vanishing of the 
terms in and gives 

/VI ^ 2/2 {n^B) 

fi-in and „ _ 

whereas if has its limiting value — — 3 )/(tc — 2)i the values are 

a 1 A 2/2 (n-3) 

B = \n and w = — - 


Between the limits of x, 

Bn~ ^^Bn.^i‘th — 2)^ 

given by (2-15), assumes once all values between its limits of range and, in fact 


, V 2 12 n—B j 
, _ / ______ , 


(5-12) 


Now 


fni^n) = 


{-)' 


y^- 




3 ■ 

(5-13) 

da;(l-a;2)*(»-’)/,,_i(i„_i). 

(6-14) 


^7r^n-4:}\ 

the limits of integration in iB being given by (6- 12). By (6-11) and(5-13) change the variable 
a: into (via y) when (5- 14) becomes 

mtn) ^ 0(n) (5-15) 

and the integral on the right is unity. Written in full (6'15) then becomes 

\Un-i) 


fniK)'- 


1 i(%-3)! (ti-1)««-3) (n-2' 


3 y/vr n - 4) ! (3 to TO _ 2)«' 


l)l(«-3) /to -2" A* 


(6-16) 


There is no difficulty now in proving by iteration from (5-10) and (5-16) that the constant has the form indicated in (5-1). Note that, in (5-9), ${}_{\mathrm{I}}f_n$ accounts for $(n-1)/n$ and ${}_{\mathrm{II}}f_n$ for $1/n$ of the total frequency.
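As a numerical illustration of the limiting form, the constant multiplying $\{(n-2)/(n-1)^{1/2} - t_n\}^{\frac12(n-4)}$ can be evaluated for the small sample sizes treated later. The sketch assumes the constant is $\dfrac{(n-1)^{1/2}[\frac12(n-3)]!}{3\sqrt\pi[\frac12(n-4)]!}\{2(n-1)^{1/2}/3n\}^{\frac12(n-4)}$, a reading of (5-1) that is consistent with the values quoted in this paper ($1/(2\sqrt3)$ for $n = 4$, 0.219166 for $n = 5$, 0.078091 for $n = 7$, 0.04019153 for $n = 8$); the function name `limit_constant` is hypothetical.

```python
import math

def limit_constant(n):
    """Assumed constant of (5-1): f_n(t) ~ C_n * (T_n - t)^((n-4)/2) near T_n."""
    # [(n-3)/2]! / [(n-4)/2]! written with gamma functions.
    factorial_ratio = math.gamma((n - 1) / 2) / math.gamma((n - 2) / 2)
    scale = 2 * math.sqrt(n - 1) / (3 * n)
    return (math.sqrt(n - 1) * factorial_ratio / (3 * math.sqrt(math.pi))
            * scale ** ((n - 4) / 2))

if __name__ == "__main__":
    for n in range(3, 9):
        print(n, round(limit_constant(n), 6))
```

Under this reading the constant for $n = 6$ comes out as exactly 5/36, the coefficient of the linear term of the outer zonal curve used for samples of 6.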





6. Samples of 4

From (2-14), (2-15) and (3-7) the frequency for $n = 4$ is found to be

$f_4(t_4) = \dfrac{\sqrt3}{2\pi} \int_D y^{-1/2}\, dx$, (6-1)

where $y = 2(1-x^2)^3 - (v - 3x + 5x^3)^2$ with $v = \sqrt3\, t_4$. (6-2)

$D$ is the range of values of $x$ which give non-negative values for $y$ with $|x| < 1$. Now

$y = -(3x^2-1)^2(3x^2-2) - 2v(5x^3-3x) - v^2$, (6-3)


from which it appears that when $v$ is small $y$ has two real roots near $-1/\sqrt3$, two imaginary roots near* $+1/\sqrt3$ and single roots near $+\sqrt2/\sqrt3$ and $-\sqrt2/\sqrt3$, accounting thus for all six

roots. With 0(v^) = 0 the four real roots are 

2 




with a® „ 


9^3 


[2_v 
- V 3“9' 


j'-Vi+vl3 

f — 1/V3— avi 


rilV9+avi f 

+ 



+ 

J _yj_„/9 J 

-VJ+W9 

— i/vs+fltui »■ 



Hence the integral at (6-1) may be written as the sum of five integrals

Wi-via 
l/v'S+auf 

Fig. 2 illustrates the division of the region of integration D. There are five divisions, 
numbered I-V, in what follows, in which we regard as ‘principal terms’ only those in log v 
and the constant term. Terms in will ultimately be ignored: 

■-Vi+t);9 

0(nt), 


-i: 

ii=j 


Vi-Vl9 
-i/s/a-OTi 

-Vi+-vl9 


da:( — 1 -1- 3a:*)~^ (2 — 


■ VJ-u/O 
l/V.'i + a'ui 


=V. 




IV 


+ l/V3-«Di 

J -l/va-au’ 

l/V3+aul 


' llV3-avi 

Neglecting 0{v^) we accordingly have 




^ + 1)~1 dx' = ^ sinh-i 1 . 


I -I- II -t III -f IV -t-V^ -^logSa^v-l- JgSinh“^ 1-I-2C'. 
The constant $2C$ derives from additional terms in integrals II, III and V:


(6-4) 


III = 


with 


+y 

c 

-y 


dx{(l - 3a;2)2 (2 - Sa:^) - 2v{5x^ - 3a:) - 
y = lj^JS — a.vi, 

* In a sense which will be obvious from Fig. 2.





We have already taken account of the term in III found when $v$ is zero. The constant $C$ derives from the even powers of $z$ in the formal expansion of the denominator of the integral element; the odd powers vanish by symmetry. Setting

* = ^ 3 - 3 '', 2-3a:2=l, 6*3- 3*^ 

On expairsion of III, 

CO r 1/V3 
k-^lj avi 

where G^k ^^he even-order coefficients in the expansion of ( 1 H- z)~-, i.e. G^j^ is the coefficient 
of On integration we are interested only in the value at the lower limit av^, for all 
terras at the upper limit (and certain terms at the lower limit) are 0(v^) at least. Hence 



Note. This diagram is designed merely to give a general idea of the limits of integration. It is not 
drawn to any scale. Following are the values of the 


Si = Vt 

Si = V|-h'u/9 


£2 - 1/V3 

^2 = Ij^Z+CLvi 


= 2/(0 V3) 


Similarly II + V also yield $C$, giving a constant additional term of $2C$.

Now 0 = 4 = + + 

0 


^3 ;c=i 4fc J 0 

= 4 f ' f '^(1 -i-*v- + (1 -xv} 

^/3j 0 X 2y3J (1 X 


_ j_~ 

“V3 _■ 


logx-(-ilog 


(iH-a:^)*— 1 

(1 -f- **)*-)- 


i + ilog 


l-(]-*3)r 

l-(-(l-*2)t 


0 




(log2 + ilog (2i-l)}. 


a;=l 

a :=0 


( 6 - 6 ) 



All logs are to base $e$ (unless otherwise indicated in what follows). Hence

$f_4(t_4) \simeq 0.372646 - \dfrac{1}{2\pi}\log v$ (6-6)

$= 0.285222 - 0.366466 \log_{10}|t_4|$ (6-7)

since $v = \sqrt3\, t_4$.
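Two of the algebraic steps in this section lend themselves to a quick numerical check. The sketch below assumes that (6-2) reads $y = 2(1-x^2)^3 - (v - 3x + 5x^3)^2$ with $v = \sqrt3\,t_4$, that (6-3) is its rearrangement, and that the coefficient of $\log v$ in (6-6) is $1/2\pi$ (which agrees with the 0.159155 of (6-9)); the function names are hypothetical.

```python
import math

def y_direct(x, v):
    """y as defined at (6-2), in the form assumed here."""
    return 2 * (1 - x * x) ** 3 - (v - 3 * x + 5 * x ** 3) ** 2

def y_factored(x, v):
    """y as rearranged at (6-3)."""
    return (-(3 * x * x - 1) ** 2 * (3 * x * x - 2)
            - 2 * v * (5 * x ** 3 - 3 * x) - v * v)

def f4_from_v(v):
    """Principal terms of the frequency in terms of v, as at (6-6)."""
    return 0.372646 - math.log(v) / (2 * math.pi)

def f4_from_t4(t4):
    """The same principal terms in terms of t_4 and common logs, as at (6-7)."""
    return 0.285222 - 0.366466 * math.log10(abs(t4))

if __name__ == "__main__":
    print(max(abs(y_direct(x / 10, 0.3) - y_factored(x / 10, 0.3))
              for x in range(-9, 10)))
    print(f4_from_v(math.sqrt(3) * 0.2), f4_from_t4(0.2))
```

The two forms of $y$ agree to rounding error, and the change from $\log v$ to $\log_{10}|t_4|$ only shifts the constant by $(1/2\pi)\log\sqrt3$.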

A. T. McKay (1933), from a different approach, gave the log term in (6-7), and, as a rough approximation to the constant term, the value 0.311668. He also showed that an expression of the form (6-7) accounted for most of the frequency, a fact of great importance. Assume that the residual term is of form

$A|t_4|^{1/2} + B|t_4|$,

and find $A$ and $B$ from

(i) $f_4(2/\sqrt3) = 1/(2\sqrt3)$ (from (5-1); also McKay (1933)),

(ii) total frequency is unity,

giving

$f_4(t_4) = 0.285222 - 0.366466\log_{10}|t_4| - 0.009178|t_4|^{1/2} + 0.031359|t_4|$. (6-8)

For algebraic manipulation at the next stage the form of residual $A'|t_4| + B't_4^2$ will be found more convenient, however, with $A'$ and $B'$ also determined from (i) and (ii). In this form

$f_4(t_4) = 0.285222 - 0.159155\log|t_4| + 0.014275|t_4| + 0.007398\,t_4^2$. (6-9)

Note the smallness of the coefficients $A$, $B$, $A'$ and $B'$ in (6-8) and (6-9).

In the following table the first four even moments as derived from frequencies (6-8) and (6-9) are compared with the actual values as derived from the formulae (1-2). Both formulae yield excellent approximations, with (6-8) always superior to (6-9) however. Either formula can obviously be used with complete confidence for deriving the probability points. The frequency graph in Fig. 4 is derived from (6-8) which should also be used for the computation of the probability points.


Moment    Actual      Formula (6-8)    Formula (6-9)
μ2        0.342857    0.342930         0.342470
μ4        0.258941    0.258979         0.258606
μ6        0.240603    0.240263         0.240206
μ8        0.245940    0.245949         0.245662
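The two conditions used to fix the residual, and the moments in the last column of the table, can be reproduced directly from the approximate frequency. This is a minimal sketch assuming (6-9) reads $f_4(t_4) = 0.285222 - 0.159155\log|t_4| + 0.014275|t_4| + 0.007398\,t_4^2$ on $|t_4| \le 2/\sqrt3$; the names `f4` and `moment` are hypothetical, and the integrals are taken in closed form.

```python
import math

T4 = 2 / math.sqrt(3)                      # limit of range for n = 4
C0, A_LOG, A1, B2 = 0.285222, 0.159155, 0.014275, 0.007398

def f4(t):
    """Approximate frequency of t_4 in the form assumed here for (6-9)."""
    t = abs(t)
    return C0 - A_LOG * math.log(t) + A1 * t + B2 * t * t

def moment(m):
    """m-th moment (m even) of (6-9) over (-T4, T4), integrated in closed form."""
    p = m + 1
    const = C0 * T4 ** p / p
    # integral of t^m log t from 0 to T is T^p (log T / p - 1 / p^2)
    logterm = A_LOG * T4 ** p * (math.log(T4) / p - 1 / p ** 2)
    linear = A1 * T4 ** (p + 1) / (p + 1)
    quadratic = B2 * T4 ** (p + 2) / (p + 2)
    return 2 * (const - logterm + linear + quadratic)

if __name__ == "__main__":
    print("ordinate at limit:", f4(T4), "target:", 1 / (2 * math.sqrt(3)))
    for m in (0, 2, 4, 6, 8):
        print(m, round(moment(m), 6))
```

The total frequency comes out as unity and the ordinate at the limit as $1/(2\sqrt3)$, which are conditions (i) and (ii); the even moments then follow without any further fitting.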


7. Samples of 5

After many computational experiments the method used for determining the frequency $f_5(t_5)$ was as follows:

(1) Using (2-14) with form (6-9) for $f_4(t_4)$, central and link ordinates, i.e. $f_5(0)$ and $f_5(1/\sqrt6)$ were computed.

(2) The approximate value of $f_5(t_5)$ near $1/\sqrt6 + 0$ was found in the form

$f_5(1/\sqrt6) + M(t_5 - 1/\sqrt6)^{1/2}$,


$M$ being known.









(3) The two zonal curves were found (i) passing through $(0, f_5(0))$ and $(1/\sqrt6, f_5(1/\sqrt6))$ with $f_5'(0) = 0$ and (ii) passing through $(1/\sqrt6, f_5(1/\sqrt6))$ and $(\tfrac32, 0)$ with the required form at $1/\sqrt6 + 0$ (i.e. as at (2) above) and at the limit $(\tfrac32 - 0)$, so that $\mu_0\ (= 1)$, $\mu_2$, $\mu_4$ and $\mu_6$ have the exact values as given for $n = 5$ by the formulae (1-2).

Setting then

$f_4(t_4) = 0.285222 - 0.159155\log_e|t_4| + B(t_4)$, (7-1)

with

$5^{1/2}\, t_4 = (2t_5 - 3x + 6x^3)/(1-x^2)^{3/2}$ (7-2)

and

$B(t_4) = 0.014275|t_4| + 0.007398\,t_4^2$, (7-3)

we have

$f_5(t_5) = \dfrac{4}{\pi\sqrt5} \int_\lambda^\mu \dfrac{dx}{1-x^2}\, f_4(t_4)$, (7-4)

the limits of integration being $\lambda$ (negative) and $\mu$ (positive) which are the values of $x$, from (7-2), corresponding respectively to $t_4 = -2/\sqrt3$ and $t_4 = +2/\sqrt3$. We shall be concerned only with the case $t_5 < 1$ when $t_4(x)$ has three real roots $\beta$, $\alpha$ and $\gamma$ of which $\beta$ is negative and $\alpha$ and $\gamma$ are positive. For (7-4) the following are required


with 


and 


/'/'da:log(l -x^) 


l+*tl-A \ 

1 + A 1 — /ij * 

= |:{log^ ( 1+ A') - ( 1 + A) - log® ( 1 - a) + log® ( 1 - A)} 


+ |{log (1 +fi) log (1 -/i) -log (1 + A) log (1 - A)} 


- ilog log (1 + a) (1 +/9) (1 +y) -f ^ { ,|/(<^) + J (^,-)). 


Ky = (a-A)/(l-a), 
= (/i-a)/(l + a), 
== (7-A)/(l-y), 

Ky = (/t_y)/(l+y), 


/C5 = (/^-A)/(l-/ff). 

'c„ = (A-M1 + A 

Ki = (a-A)/(l + a), 

ATg = {/i-a)l{l-a), 


^^9 = (r-A)/(i+r). 
^io = (A-r)/(i-y). 
^11 = (^~ A)/(l +^), 
^19=(A-/?)/(1-A 


7(x)=r ^^^ = logxlog(l + A:)-^(/c), 
J n 1 

Jo 1 -* 


(7-5) 


(7-6) 


(7-7) 


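$\phi(\kappa) = \dfrac{\kappa}{1^2} + \dfrac{\kappa^2}{2^2} + \dfrac{\kappa^3}{3^2} + \ldots$,

$\psi(\kappa) = \dfrac{\kappa}{1^2} - \dfrac{\kappa^2}{2^2} + \dfrac{\kappa^3}{3^2} - \ldots$,

when $\kappa \le 1$.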



It is useful to note that $\phi(1) = 1.644934 = 2\psi(1)$. The functions $\phi(\kappa)$ and $\psi(\kappa)$ do not appear to be tabulated. By fitting curves to their values for equally spaced intervals of 0.05 from 0 to 0.5 the following very close approximations are found, applicable for $x \le \tfrac12$:

$\phi(x) = 1.000567x + 0.233464x^2 + 0.186052x^3$,

$\psi(x) = 0.999835x - 0.244220x^2 + 0.077024x^3$.

When $1 \ge \kappa > \tfrac12$ the following formulae can be used:

$\phi(\kappa) = \phi(1) - \log\kappa\,\log(1-\kappa) - \phi(1-\kappa)$,

$\psi(\kappa) = \tfrac12\phi(1) + \log\kappa\,\log(1+\kappa) - \phi(1-\kappa) + \tfrac12\phi(1-\kappa^2)$.

When $\kappa > 1$ we use

$\phi(\kappa) = 2\phi(1) - \tfrac12\log^2(1/\kappa) - \phi(1/\kappa)$,
$\psi(\kappa) = 2\psi(1) + \tfrac12\log^2(1/\kappa) - \psi(1/\kappa)$.
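The functions $\phi$ and $\psi$ are easily evaluated from their series today, so the cubic approximations and the reduction formulae for $\tfrac12 < \kappa \le 1$ can be compared with direct summation. The sketch assumes $\phi(\kappa) = \sum \kappa^j/j^2$ and $\psi(\kappa) = \sum (-1)^{j+1}\kappa^j/j^2$, with $\phi(1) = \pi^2/6$; the helper names are hypothetical.

```python
import math

PHI_1 = math.pi ** 2 / 6          # phi(1) = 1.644934...
PSI_1 = math.pi ** 2 / 12         # psi(1) = phi(1) / 2

def phi(k, terms=400):
    """phi(k) = k/1^2 + k^2/2^2 + k^3/3^2 + ..., summed directly for 0 <= k < 1."""
    return sum(k ** j / j ** 2 for j in range(1, terms + 1))

def psi(k, terms=400):
    """psi(k) = k/1^2 - k^2/2^2 + k^3/3^2 - ..., summed directly for 0 <= k < 1."""
    return sum((-1) ** (j + 1) * k ** j / j ** 2 for j in range(1, terms + 1))

def phi_cubic(x):
    """Fitted cubic for phi, intended for x <= 1/2."""
    return 1.000567 * x + 0.233464 * x ** 2 + 0.186052 * x ** 3

def psi_cubic(x):
    """Fitted cubic for psi, intended for x <= 1/2."""
    return 0.999835 * x - 0.244220 * x ** 2 + 0.077024 * x ** 3

def phi_upper(k):
    """Reduction formula for phi when 1/2 < k < 1."""
    return PHI_1 - math.log(k) * math.log(1 - k) - phi(1 - k)

def psi_upper(k):
    """Reduction formula for psi when 1/2 < k < 1."""
    return (0.5 * PHI_1 + math.log(k) * math.log(1 + k)
            - phi(1 - k) + 0.5 * phi(1 - k * k))

if __name__ == "__main__":
    for i in range(1, 11):
        x = 0.05 * i
        print(round(x, 2), phi(x) - phi_cubic(x), psi(x) - psi_cubic(x))
```

The reduction formulae map any argument above one half back into the range where the cubics were fitted, which is presumably why only that range was fitted.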

Another useful formula is 

The algebra of the contribution to (7-4) from $B(t_4)$ is without mathematical interest.
From the formula the following were the values found for the central frequency and the 
second link frequency: 

$f_5(0) = 0.606563$; $f_5(1/\sqrt6) = 0.599069$. (7-8)

The moments, computed by (2-1), with $n = 5$, are


$\mu_0 = 1$, $\mu_2 = 0.375$, $\mu_4 = 0.361607$, $\mu_6 = 0.474609$, $\mu_8 = 0.719382$. (7-9)
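The second and fourth moments quoted for each sample size can be recomputed from closed expressions. The sketch below assumes the standard exact results for $\sqrt{b_1}$ in normal samples, $\mu_2 = 6(n-2)/\{(n+1)(n+3)\}$ and $\mu_4 = 108(n-2)(n^2+27n-70)/\{(n+1)(n+3)(n+5)(n+7)(n+9)\}$; whether these coincide in form with the formulae of the first section, which is not reproduced here, is an assumption. The names `mu2` and `mu4` are hypothetical.

```python
def mu2(n):
    """Variance of sqrt(b1) for normal samples of n (standard exact result)."""
    return 6 * (n - 2) / ((n + 1) * (n + 3))

def mu4(n):
    """Fourth moment of sqrt(b1) for normal samples of n (standard exact result)."""
    return (108 * (n - 2) * (n * n + 27 * n - 70)
            / ((n + 1) * (n + 3) * (n + 5) * (n + 7) * (n + 9)))

if __name__ == "__main__":
    for n in range(4, 9):
        print(n, round(mu2(n), 6), round(mu4(n), 6))
```

For $n = 5$ this gives 0.375 and 0.361607, and the same two functions give the second and fourth moments quoted in this paper for $n = 4$, 6, 7 and 8.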

Computation by approximate integration of certain of the ordinates gave evidence of marked irregularity near the link $t_5 = 1/\sqrt6$. In consequence, it seemed desirable to try to find a term (in addition to the constant given at (7-8)) of the expansion of $f_5(t_5)$ near $1/\sqrt6 + 0$.

Setting 1 


^/6 




where t is small and positive — we shall be interested only in a term in — we find 


.Mh)- 


4 


I’D— yid 

+ 

J A^-A7. J v-{-At 


dx 


i|l. 


tfiih 


(7-10) 


(7-11) 


The values v ± At^ are the abscissae of the points at which the curve = t^{x), given by (7-2), 
intersects the link line == near x ~ v = — ii/6. It can easily be shown that 




12’ 


(7-12) 


We are not concerned with the values (A + A'<) and {/i + y't) which are the abscissae corre- 
sponding to the intersection of = t^{x) with, = — 2/.^3 and its third intersection with 
ti = 2/a/ 3. Remembering that at the latter link/ 4 (« 4 ) has the value 1/(2 .^3), the integral- 
free term in <-i in the first derivative /^(fg) oifSs) is 

~\At-i^f^{v + Ati)-A --MH} = (7-13) 

Also we have to con.sider the integral term in/ 5 (i!g). For this purpose, from (7-10) and (7-2), 


6 

^6 




+ 3 (l-a:^)- 









Remembering (7-1), it can be shown that the only term in (7'11) from which a term in 
can come is approximately 

4 1 fr dx 


4 1 r;* dx . 11 1 tv 

n ^5 27rJ ^ 1 - a;® ((* ^6 9/ 


4*‘i. 


(7-14) 


$\lambda$ and $\mu$, the limiting values, being respectively negative and positive.
Differentiating (7-14) in respect of $t$ we find




Changing variables by 


/I 

■(V6'''9) ~ 3 ^^ 


and letting t tend towards + 0, we find for the term in 

2 /7r6b 

7T^^J5\ 6 


Adding (7-13) and {7-16) we find 




(7-16) 


n 

U 


(7-17) 

(7.18) 


On integrating we find for the term in 

From (5-1) the value of $f_5(x)$ near $x = \tfrac32$ is

$0.219166(\tfrac32 - x)^{1/2}$,

where $x$ is usually written for simplicity instead of $t_5$ in the remainder of this section.

Having regard to (7-8) and to the fact that, from § 4, $f_5'(0) = 0$, in the half-zone $(0, 1/\sqrt6)$, $f_5(x) = F(x)$ must be of form

$F(x) = 0.606563 + a_2x^2 + a_3x^3 + a_4x^4$. (7-19)

The first relation between the coefficients is found by giving expression to the fact that $y = F(x)$ passes through the link-point $(1/\sqrt6, 0.599069)$:

$0.166667a_2 + 0.068041a_3 + 0.027778a_4 = -0.007494$. (7-20)

In the zone $(1/\sqrt6, \tfrac32)$ assume that

$f_5(x) = G(x) = -0.594117(x - 1/\sqrt6)^{1/2} + 0.219166(\tfrac32 - x)^{1/2} + b_0 + b_1(\tfrac32 - x) + b_2(\tfrac32 - x)^2 + b_3(\tfrac32 - x)^3$, (7-21)

designed to conform with requirements (7-17) and (7-18). Since $y = G(x)$ must pass through $(\tfrac32, 0)$,

$b_0 = 0.594117(\tfrac32 - 1/\sqrt6)^{1/2} = 0.620776$. (7-22)

Taking the value of $b_0$ into account and giving algebraic expression to $y = G(x)$ passing through $(1/\sqrt6, 0.599069)$, we find

$1.091752b_1 + 1.191922b_2 + 1.301283b_3 = -0.250706$. (7-23)




To find the six coefficients $a_2$, $a_3$, $a_4$ (in (7-19)) and $b_1$, $b_2$, $b_3$ (in (7-21)), we have, so far, found two equations, (7-20) and (7-23). The remaining four equations are found by equating the total frequency to unity and the first three even moments to their true values given at (7-9), i.e. setting

$\tfrac12\mu_{2k} = \int_0^{1/\sqrt6} dx\, x^{2k} F(x) + \int_{1/\sqrt6}^{3/2} dx\, x^{2k} G(x)$ $(k = 0, 1, 2, 3)$.

On substituting for $a_4$ given by (7-20), for $b_0$ given by (7-22) and for $b_3$ given by (7-23), we find the four equations in $a_2$, $a_3$, $b_1$, $b_2$:

0-os90721aa+0-0n39999a3 + 0-297981bi_+0-108441b2 = -0-071173,' 

0-0364802ffip+0-0ni0237a„ + 0-268332b, + 0-082590b2 = -0-066293, 

^ ® (7-24) 

O-O^SOOlttj +0-on07l76a3 + 0-3032486i+0-079086b2 = -0-076019, 

0-056368aa +0-051169% +0-400090bi+'0-089674b2 = -0-101300. 


On solution (and checking by substitution) the coefficients are found to give finally the following frequencies:

Zone                          $f_5(x)$

$0$ to $1/\sqrt6$:            $0.606563 - 0.3307x^2 + 3.1955x^3 - 6.1129x^4$,

$1/\sqrt6$ to $\tfrac32$:     $0.620776 - 0.594117(x - 1/\sqrt6)^{1/2} + 0.219166(\tfrac32 - x)^{1/2} - 0.268273(\tfrac32 - x) + 0.067263(\tfrac32 - x)^2 - 0.029195(\tfrac32 - x)^3$, (7-25)

with $x = t_5$.


The extremely interesting form of the frequency curve may be observed from Fig. 6. In the first half-zone the frequency shows but little variation; the curve declines to a minimum of 0.6058 at $x = 0.0894$, then rises to a maximum of 0.6136 at $x = 0.3027$. It then recedes to the link $1/\sqrt6$, where it assumes the value 0.5991. As one type of check on the reliability of the results in general, some ordinates were computed directly (i.e. using (7-4) and (7-1)), or by approximate integration using (7-4) and (6-8), and compared with the ordinates computed from (7-25) to the following effect:


Trial value     Value of frequency
of t_5          By approx. integration     By (7-25)
0.15            0.6069                     0.6068
0.3             0.6106                     0.6136
0.6             0.3660                     0.3603
0.9             0.2232                     0.2308
1.2             0.1377                     0.1371


Except perhaps for the frequency at $t_5 = 0.9$, the correspondence is satisfactory; there can be little doubt that the more accurate figures are those from (7-25).

As a stringent test of the accuracy of the frequency the 8th moment $\mu_8$ was computed from the empirical curves at (7-25) and compared with the actual value given at (7-9):

$\mu_8' = 0.7191$, $\mu_8 = 0.7194$.

Even as the figures stand the check is decisive: it should be added that the 4th place of decimals in $\mu_8'$ is suspect owing to the approximation used.
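The zonal curves for samples of 5 can be evaluated and integrated numerically as a further check. This is a sketch only: it assumes the coefficients of (7-25) as read here (0.606563, 0.3307, 3.1955, 6.1129 in the inner half-zone; 0.620776, 0.594117, 0.219166, 0.268273, 0.067263, 0.029195 in the outer zone), and the names `f5` and `moment` are hypothetical. A plain midpoint rule is used, which copes with the square-root terms at the zone ends.

```python
import math

LINK = 1 / math.sqrt(6)   # second link of t_5
LIMIT = 1.5               # limit of range of t_5

def f5(x):
    """Empirical frequency of t_5, zone by zone, in the form assumed for (7-25)."""
    x = abs(x)
    if x > LIMIT:
        return 0.0
    if x <= LINK:
        return 0.606563 - 0.3307 * x ** 2 + 3.1955 * x ** 3 - 6.1129 * x ** 4
    u = LIMIT - x
    return (0.620776 - 0.594117 * math.sqrt(x - LINK) + 0.219166 * math.sqrt(u)
            - 0.268273 * u + 0.067263 * u ** 2 - 0.029195 * u ** 3)

def moment(m, steps=20000):
    """m-th moment (m even) by the midpoint rule, each zone integrated separately."""
    total = 0.0
    for lo, hi in ((0.0, LINK), (LINK, LIMIT)):
        h = (hi - lo) / steps
        for i in range(steps):
            x = lo + (i + 0.5) * h
            total += h * x ** m * f5(x)
    return 2 * total

if __name__ == "__main__":
    print("total frequency:", moment(0))
    print("second moment:", moment(2))
    for x in (0.15, 0.3, 0.6, 0.9, 1.2):
        print(x, round(f5(x), 4))
```

The ordinates printed at the five trial values can be set against the last column of the table, and the total frequency and second moment against 1 and 0.375.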

8. Samples of 6

In this case the links are 0, $1/\sqrt2$ and $4/\sqrt5$, and the link frequencies at the first two were found by approximate integration using form (2-13) with $t_5$ given by (2-8). For this purpose, drawings were made of the two sections of $f_5(t_5)$ on a scale sufficient to ensure that an ordinate read for any abscissa would be correct probably to the 3rd place of decimals. For intervals of 1°, values of $t_5$ were computed over the whole range by (2-8) (for $t_6$ given), and graphically the value of $f_5(t_5)$ was read off for each $t_5$. Hundreds of readings had to be made, but actually the work, with a little practice, was rapid and accurate, the entries being practically self-checking. The Gregory formula (using 2 correction terms) was used to give the following results:

$f_6(0) = 0.6889$; $f_6(1/\sqrt2) = 0.3247$. (8-1)

The two zonal frequency curves, say $y = F(x)$ in $(0, 1/\sqrt2)$ and $y = G(x)$ in $(1/\sqrt2, 4/\sqrt5)$, writing $x$ instead of $t_6$, must have the following properties:

(i) $F(0) = 0.6889$,

(ii) $F'(0) = 0$ (§ 4),

(iii) $F(1/\sqrt2) = G(1/\sqrt2) = 0.3247$, (8-2)

(iv) $F'(1/\sqrt2) = G'(1/\sqrt2)$ (§ 3),

(v) $G(x) \simeq \tfrac{5}{36}(\beta - x)$ near $x = \beta$ (from (5-1)).

The curves were

$F(x) = 0.6889 + a_2x^2 + a_3x^3 + a_4x^4 + a_5x^5$,

$G(x) = \tfrac{5}{36}(\beta - x) + b_2(\beta - x)^2 + b_3(\beta - x)^3 + b_4(\beta - x)^4$, with $\beta = 4/\sqrt5$. (8-3)

The exact moments are

$\mu_0 = 1$, $\mu_2 = 0.380952$, $\mu_4 = 0.409191$, $\mu_6 = 0.642924$, $\mu_8 = 1.219892$. (8-4)

It is proposed to compute the seven coefficients in (8-3) using (8-2) and (8-4). Now, with a curve of the type of $f_6(x)$, where much of the frequency is at the ends, it is evident that the contribution from the zone $(0, 1/\sqrt2)$ to the higher moments is exceedingly minute; this property was utilized to divide the single series of seven equations into two series of three and four equations using the following device: Approximate $F(x)$ by a curve $F_1(x)$ given by

$F_1(x) = 0.6889 + a_2'x^2$, (8-5)

finding $a_2'$ simply by passing $y = F_1(x)$ through $(1/\sqrt2, 0.3247)$, giving $a_2' = -0.7284$. If $\mu_{2s}$ is the $2s$th moment, let $\mu_{2s}'$ and $\mu_{2s}''$ be the contributions from $F(x)$ and $G(x)$ respectively so that $\mu_{2s} = \mu_{2s}' + \mu_{2s}''$. Let $\nu_{2s}$ be the estimate, using $F_1(x)$, of $\mu_{2s}'$. For $s = 2, 3, 4$ the values are

$\nu_4 = 0.030318$, $\nu_6 = 0.009360$, $\nu_8 = 0.003397$,

which, subtracted from the corresponding $\mu_{2s}$ given by (8-4), give very close estimates of $\mu_{2s}''$, which involve only $b_2$, $b_3$, $b_4$. The equations in order

(1) $1.170178b_2 + 1.265837b_3 + 1.369316b_4 = 0.174500$,

(2) $0.726980b_2 + 0.378949b_3 + 0.235502b_4 = 0.058771$, (8-6)

(3) $1.200067b_2 + 0.534702b_3 + 0.289663b_4 = 0.091217$,

are found from (iii) at (8-2), from $\mu_6''$ and from $\mu_8''$.



The equations in the $a$'s are

(4) $0.5a_2 + 0.353553a_3 + 0.25a_4 + 0.176777a_5 = -0.3642$,

(5) $1.414214a_2 + 1.5a_3 + 1.414214a_4 + 1.25a_5 = -0.653179$,

(6) $0.117851a_2 + 0.0625a_3 + 0.035355a_4 + 0.020833a_5 = -0.115790$, (8-7)

(7) $0.035355a_2 + 0.020833a_3 + 0.012627a_4 + 0.007813a_5 = -0.031433$,

where (4) is from (iii) at (8-2), (5) from (iv) at (8-2), (6) from the total frequency $= 1$ and (7) from variance $= \mu_2$.

The solutions of (8-6) and (8-7) yield the following frequencies:

Zone                            $f_6(x)$

$0$ to $1/\sqrt2$:              $0.6889 - 1.2715x^2 - 2.607x^3 + 9.5669x^4 - 6.7790x^5$,

$1/\sqrt2$ to $4/\sqrt5$:       $\tfrac{5}{36}(\beta - x) + 0.047068(\beta - x)^2 + 0.024897(\beta - x)^3 + 0.064198(\beta - x)^4$, with $\beta = 4/\sqrt5$, (8-8)

with $x = t_6$.

As a check, the 4th moment computed from the foregoing curves gave 0.4108 as compared with the actual $\mu_4 = 0.4092$, an error of 0.38 %. This is not of any importance from the viewpoint of the computation of the probability points, but it illustrates how, using the integral iteration method as generally in this paper, the momental check reveals increasing discrepancies with increasing $n$.
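The momental check described above can be repeated numerically. The sketch below assumes the zonal curves for $n = 6$ as read here: $0.6889 - 1.2715x^2 - 2.607x^3 + 9.5669x^4 - 6.7790x^5$ in the inner half-zone and $\tfrac{5}{36}(\beta-x) + 0.047068(\beta-x)^2 + 0.024897(\beta-x)^3 + 0.064198(\beta-x)^4$ in the outer zone. Several of these digits, and the leading coefficient 5/36, are reconstructions from a damaged print and should be treated as assumptions; the names `f6` and `moment` are hypothetical.

```python
import math

LINK = 1 / math.sqrt(2)    # second link of t_6
BETA = 4 / math.sqrt(5)    # limit of range of t_6

def f6(x):
    """Empirical frequency of t_6, zone by zone (coefficients as assumed above)."""
    x = abs(x)
    if x > BETA:
        return 0.0
    if x <= LINK:
        return (0.6889 - 1.2715 * x ** 2 - 2.607 * x ** 3
                + 9.5669 * x ** 4 - 6.7790 * x ** 5)
    u = BETA - x
    return 5 / 36 * u + 0.047068 * u ** 2 + 0.024897 * u ** 3 + 0.064198 * u ** 4

def moment(m, steps=20000):
    """m-th moment (m even) by the midpoint rule, each zone integrated separately."""
    total = 0.0
    for lo, hi in ((0.0, LINK), (LINK, BETA)):
        h = (hi - lo) / steps
        for i in range(steps):
            x = lo + (i + 0.5) * h
            total += h * x ** m * f6(x)
    return 2 * total

if __name__ == "__main__":
    print("ordinates at the link:", f6(LINK), f6(LINK + 1e-9))
    for m in (0, 2, 4):
        print(m, round(moment(m), 4))
```

The total frequency and variance are reproduced because they were imposed; the 4th moment was not imposed, and it comes out near 0.4108 rather than 0.4092, which is the discrepancy noted in the text.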

It might be thought that by constructing empirically ‘ almost any ’ symmetrical frequency 
curve, so that say the 0th, 2nd and 4th moments have the true values, we shall ensure that
the subsequent even moments computed from such an empirical curve will approximate 
closely to the corresponding true values. That this is not the case may be seen by computing 
the 6th and 8th moments by the well-known Karl Pearson iteration formula,* where 
and % have their true values, for with n = 6 ; 


( 8 - 8 ) 



            Actual      Karl Pearson     Percentage
                        iteration        discrepancy
β4 =        11.6291     12.4934          +10.7
β6 =        57.9214     73.3990          +26.7

Even when, in the Pearson iteration for $\beta_6$ one gives $\beta_4$ the correct value, we find a percentage discrepancy of 17.9. These percentages place in perspective the minuteness of the percentage errors found in using the higher momental check as it is used throughout this paper.

It is an interesting question of general import whether in work of this kind the arduous and potentially erroneous computation (by integral iteration) of the central and link frequencies could be dispensed with, and reliance placed entirely on the moments, together with the functional properties of the frequencies, which, of course, merely represent an elaboration of the Karl Pearson approach. In this connexion a couple of experiments were made on the frequency for $n = 6$.


* Tables for Statisticians and Biometricians, Part I, 2nd ed., p. xi.






For the first experiment, the two zonal curves were assumed to have the correct order of contact, the correct form, (8-2) (v), at the limit of range and the correct values of $\mu_0\ (= 1)$, $\mu_2$, $\mu_4$, $\mu_6$. The equations are

Zone

$0$ to $1/\sqrt2$:             $F_1(x) = 0.659844 - 1.075618x^2 + 0.555991x^3$,

$1/\sqrt2$ to $4/\sqrt5$:      $G_1(x) = \tfrac{5}{36}(\beta - x) + 0.080660(\beta - x)^2 - 0.085469(\beta - x)^3 + 0.133119(\beta - x)^4$, with $\beta = 4/\sqrt5$. (8-9)

This gives a central frequency 0.6598 compared with the computed frequency (by (8-1)) of 0.6889. In all the circumstances the difference is not important. The 8th moment, $\mu_8'$, from (8-9), is 1.217706, or $-0.18$ % in error.

The second experiment contemplated the frequency as a single-curve system with correct first derivative $(-\tfrac{5}{36})$ at the limit and with correct $\mu_0$, $\mu_2$, $\mu_4$, $\mu_6$. The curve is

$F_1(x) = 0.669426 - 1.51097x^2 + 1.53864x^3 - 0.60545x^4 + 0.085157x^5$, (8-10)

which has the properties: (i) the central ordinate 0.6694 is close to the actual; (ii) limit value from curve scarcely differed from the actual since $F_1(4/\sqrt5) = 0.0015$; (iii) from curve $\mu_8' = 1.2237$, an error of 0.31 %.

All the systems (8-8), (8-9) or (8-10) yield probability points which differ very little. For instance, in the three cases, the 5 % point is given by

System      5 % probability point
(8-8)       1.0432
(8-9)       1.0386
(8-10)      1.0384

The practical identity of the latter two is due to the fact that the frequencies were derived on very similar hypotheses: it does not mean that the result is more reliable than that from (8-8) which, assuming the accuracy of the calculation of the link ordinates, must be deemed to be the most correct and is adopted for the iteration to the $n = 7$ stage. Nevertheless, these experiments convey the hint of general application that if we know (i) a number of moments, (ii) the limits of range and the frequency form at the limits of range, and (iii) that the amount of frequency near the limits of range is not negligible, we will probably be in a position to estimate with fair accuracy the points of low probability. For this, however, hypothesis (iii) is essential; it has no value from the computational point of view if the frequency near the limits is negligible. This point is discussed further in § 10.

9. Samples of 7

The functional properties of the curves at the $n = 7$ stage are as follows. Let the three links be denoted by $\alpha$, $\beta$, $\gamma$, so that

$\alpha = 1/\sqrt{12}$, $\beta = 3/\sqrt{10}$, $\gamma = 5/\sqrt6$. (9-1)

Denoting $t_7$ by $x$, set

$y = x - \alpha$, $z = \gamma - x$, (9-2)

and let the curves in the half-zone $(0, 1/\sqrt{12})$, and in the zones $(1/\sqrt{12}, 3/\sqrt{10})$ and $(3/\sqrt{10}, 5/\sqrt6)$ be denoted respectively by $F(x)$, $G(y)$ and $H(z)$. We then have

(i) $F(0) = 0.6781 = A$,

(ii) $F'(0) = 0$,

(iii) $F(\alpha) = G(0) = 0.5870 = B$,

(iv) $F'(\alpha) = G'(0)$,

(v) $F''(\alpha) = G''(0)$, (9-3)

(vi) $G(\beta - \alpha) = H(\gamma - \beta) = 0.1838 = C$,

(vii) $G'(\beta - \alpha) = -H'(\gamma - \beta)$,

(viii) $H(z) = Dz^{3/2} + c_2z^2 + c_3z^3$ with $D = 0.078091$.

The central and link ordinates $A$, $B$ and $C$ at (i), (iii) and (vi), were derived by the Gregory formula from (2-14), using intervals of 0.01, 0.025 and 0.05 at different sections of the integral range. The equalities in the derivatives at the links are in accordance with order of contact requirements (§ 3). The first term on the right of (viii) is from (5-1) with $n = 7$.

Conditions (9-3) determine the form of the polynomials:

$F(x) = A + a_2x^2 + a_4x^4$,

$G(y) = B + (2a_2\alpha + 4a_4\alpha^3)y + \tfrac12(2a_2 + 12a_4\alpha^2)y^2 + b_3y^3 + b_4y^4$, (9-4)

$H(z) = Dz^{3/2} + c_2z^2 + c_3z^3$,

with $x = t_7$.

$F(x)$ is taken as an even function of $x$ because it is symmetrical in the zone $(-1/\sqrt{12}$ to $+1/\sqrt{12})$. This should have been done in the case of $n = 6$; neglect to do so was not serious enough to render recalculation necessary.

The moments used were:

$\mu_0 = 1$, $\mu_2 = 0.375$, $\mu_4 = 0.421875$, $\mu_6 = 0.733487$. (9-5)

Using (9-3) in conjunction with $\mu_0$ and $\mu_2$ (only) in (9-5) the following equations in the six unknowns $a_2$, $a_4$, $b_3$, $b_4$, $c_2$, $c_3$ were found:





Eqn.   Left: coefficients of                                                  Right:
no.    a2         a4         b3         b4         c2         c3            absolute term
1      12         1          —          —          —          —             -13.112496
2      0.816667   0.281316   0.287507   0.189756   —          —             -0.403210
3      —          —          —          —          1.193686   1.304172      0.094626
4      1.897366   0.756233   1.306833   1.150028   2.185118   3.581056      -0.122438
5      0.229606   0.069278   0.047439   0.025048   0.434724   0.356221      -0.122132
6      0.130637   0.041871   0.032192   0.017836   0.668438   0.496634      -0.044366


Approximations to $F$, $G$ and $H$ were found:

$F_1(x) = 0.6781 - 1.238888x^2 + 1.754160x^4$,

$G_1(y) = 0.5870 - 0.546478y - 0.361808y^2 + 0.171129y^3 + 0.347163y^4$, (9-6)

$H_1(z) = 0.078091z^{3/2} - 0.017319z^2 + 0.088408z^3$.
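The six linear equations and their solution can be cross-checked by substitution, which is also a useful guard against misread digits. The sketch below takes the coefficient table and the solution $a_2 = -1.238888$, $a_4 = 1.754160$, $b_3 = 0.171129$, $b_4 = 0.347163$, $c_2 = -0.017319$, $c_3 = 0.088408$ as read here; several digits are restored from a damaged print and are assumptions, and the names `ROWS`, `SOLUTION` and `residuals` are hypothetical.

```python
# Each row: coefficients of (a2, a4, b3, b4, c2, c3) and the absolute term.
ROWS = [
    ([12.0, 1.0, 0.0, 0.0, 0.0, 0.0], -13.112496),
    ([0.816667, 0.281316, 0.287507, 0.189756, 0.0, 0.0], -0.403210),
    ([0.0, 0.0, 0.0, 0.0, 1.193686, 1.304172], 0.094626),
    ([1.897366, 0.756233, 1.306833, 1.150028, 2.185118, 3.581056], -0.122438),
    ([0.229606, 0.069278, 0.047439, 0.025048, 0.434724, 0.356221], -0.122132),
    ([0.130637, 0.041871, 0.032192, 0.017836, 0.668438, 0.496634], -0.044366),
]

# Coefficients appearing in the approximations F1, G1 and H1.
SOLUTION = [-1.238888, 1.754160, 0.171129, 0.347163, -0.017319, 0.088408]

def residuals(rows=ROWS, solution=SOLUTION):
    """Left-hand side minus absolute term for every equation."""
    return [sum(c * s for c, s in zip(coeffs, solution)) - rhs
            for coeffs, rhs in rows]

if __name__ == "__main__":
    for number, r in enumerate(residuals(), start=1):
        print(number, f"{r:+.6f}")
```

All six residuals are of the order of the rounding in the printed coefficients, so the table and the fitted curves are mutually consistent as read.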






These yielded estimates of the 4th and 6th moments as follows:

$\mu_4' = 0.419712$, $\mu_6' = 0.720776$, (9-7)

differing by $-0.5$ % and $-1.7$ % respectively from the correct values at (9-5). These deviations were not serious from the viewpoint of probability-point determination. Nevertheless, it seemed worth while to try to achieve a closer approximation. This was done by finding a 'corrector' $\phi(x)$ (not positive, like a frequency, for all values of $x$) with the following properties:

(i) total 'frequency' zero,

(ii) '2nd moment' zero,

(iii) $\phi'(0) = 0$, (9-8)

(iv) $\phi(\gamma) = \phi'(\gamma) = 0$,

(v) '4th moment' $= \mu_4 - \mu_4' = 0.002163$.

Then $\phi(x) = 0.002404 - 0.028863x^2 + 0.045232x^3 - 0.024236x^4 + 0.004342x^5$, (9-9)

and the frequencies finally adopted are

$0$ to $1/\sqrt{12}$:               $F(x) = F_1(x) + \phi(x)$,

$1/\sqrt{12}$ to $3/\sqrt{10}$:     $G(y) = G_1(y) + \phi(x)$, (9-10)

$3/\sqrt{10}$ to $5/\sqrt6$:        $H(z) = H_1(z) + \phi(x)$,

$F_1$, $G_1$ and $H_1$ being given by (9-6) and $x = t_7$. It is evident from the smallness of the coefficients of $\phi(x)$ in (9-9) that the correction effected by $\phi(x)$ is minute. From (9-10) the moment $\mu_6'$ is 0.728972, so that the error is reduced to about one-third of what it was using $F_1$, $G_1$ and $H_1$.

10. Samples of 8

The links and link frequencies are as follows:

Link                        Link frequency

$0$                   :     $0.6927 = A$,

$\beta = 2/\sqrt{15}$ :     $0.4442 = B$,

$\gamma = 2/\sqrt3$   :     $0.1018 = C$, (10-1)

$\delta = 6/\sqrt7$   :     $0.04019153(\delta - x)^2 = D(\delta - x)^2/2$,

where $x = t_8$.

Set

$y = x - \beta$,
$z = \delta - x$,

$\kappa = \gamma - \beta = 0.638303$, (10-2)

$\lambda = \delta - \gamma = 1.113086$.


The orders of contact (§3) entail the following forms for the three zones: 


Zone 

0-2/,Jld:F(x) = J + a-ij+a^j+ai^, 

2l^jl5 — 2j.j3 : G{y) — R-b + + + + 

2/V3-6/V7 :Hiz) = + + 



Five of the seven equations required to determine the $a$, $b$ and $c$ will be found from the order of contact conditions, as follows:

p 

(i) 5 = ^+%-% +03— +<*4^. 

/ « B\k^ , 

(ii) G = B+laay?+a3Y + «4 0| + |^^2 + ®3^+®4“^j 2 6 ”^^*24’ 

A® A® A* 

(iii) = D-^ + c^— + CfY^, 

m p I B\ , , AC® A® A® 

(iv) a2/&+a3~ + a4^ + |^Oa + a3/3 + a4-2 j'f + ha-g +^>4-^ DA-Cg ^ 04^, 

02 ^2 A® 

(v) a^+a^^ + ai^ + b^K+bi— = D + c^X + c^-^. 

The remaining two equations were found by equating the 0th and 2nd moments from the curves to the true values 1 and 4/11 respectively. The frequency functions found were as follows:

$F(x) = 0.6927 - 0.320142x^2 - 2.7761x^3 + 3.08x^4$,

$G(y) = 0.4442 - 0.854177y + 0.308677y^2 + 0.649680y^3 - 0.553667y^4$, (10.4)

$H(z) = 0.040192z^2 + 0.027763z^3 + 0.008933z^4$,

where $x = t_8$.

For reasons which will be apparent in the next section, it was not deemed necessary to apply higher momental checks in this case.
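The two moment conditions and the agreement at the links can be checked directly from the printed coefficients of (10.4). The sketch below assumes the powers shown there ($x^2, x^3, x^4$ in $F$; $y$ to $y^4$ in $G$; $z^2, z^3, z^4$ in $H$) and uses composite Simpson integration; the function names are illustrative and not taken from the paper.

```python
import math

BETA, GAMMA, DELTA = 2 / math.sqrt(15), 2 / math.sqrt(3), 6 / math.sqrt(7)
KAPPA, LAMBDA = GAMMA - BETA, DELTA - GAMMA


def F(x):  # zone 0 .. beta
    return 0.6927 - 0.320142 * x**2 - 2.7761 * x**3 + 3.08 * x**4


def G(y):  # zone beta .. gamma, with y = x - beta
    return 0.4442 - 0.854177 * y + 0.308677 * y**2 + 0.649680 * y**3 - 0.553667 * y**4


def H(z):  # zone gamma .. delta, with z = delta - x
    return 0.040192 * z**2 + 0.027763 * z**3 + 0.008933 * z**4


def simpson(f, a, b, steps=400):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def moment(k):
    """k-th moment of the symmetric frequency built from F, G and H (k even)."""
    half = (simpson(lambda x: x**k * F(x), 0.0, BETA)
            + simpson(lambda x: x**k * G(x - BETA), BETA, GAMMA)
            + simpson(lambda x: x**k * H(DELTA - x), GAMMA, DELTA))
    return 2 * half


print("F at first link :", round(F(BETA), 4), "(B = 0.4442)")
print("G at second link:", round(G(KAPPA), 4), "(C = 0.1018)")
print("H at second link:", round(H(LAMBDA), 4), "(C = 0.1018)")
print("0th moment:", round(moment(0), 5), " 2nd moment:", round(moment(2), 5))
```

The 0th and 2nd moments should come out close to 1 and 4/11, the small discrepancies reflecting the rounding of the printed coefficients.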

Reference may here be made to yet another experiment, the negative result of which may have some interest. At the $n = 6$ stage the remarkable 'regularity' which the curve assumed, after its highly bizarre appearance at the stage before, suggested that orders of contact (except at the limit of range) might be ignored at a slightly later stage and a single curve fitted using the moments only.

Using $\mu_0$, $\mu_2$, $\mu_4$ and $\mu_6$, and the $D(\delta - x)^2/2$ (see (10.1)) form at the limit of range with $F_1'(0) = 0$, the following frequency curve was found:

$$F_1(x) = 0.040192(\delta - x)^2 + 0.132866(\delta - x)^3 - 0.293716(\delta - x)^4 + 0.231146(\delta - x)^5 - 0.039279(\delta - x)^6 - 0.005209(\delta - x)^7. \qquad (10.5)$$

The correct values of the moments (to 6 places) were

$$\mu_0 = 1, \quad \mu_2 = 0.363636, \quad \mu_4 = 0.414644, \quad \mu_6 = 0.763334, \quad \mu_8 = 1.823617. \qquad (10.6)$$

The value $\mu_8'$ of the 8th moment computed from the curve was 1.993270, an error therefore of $+9.3\%$. The central ordinate $F_1(0) = 0.9017$ as compared with the actual 0.6927, so that the curve $F_1(x)$ could not validly be used for further iteration, since the frequencies near the central frequency would be considerably in error. The probability (computed from (10.5)) for $F_1(x)$ beyond the 'true' 5% probability point (computed from (10.3)) is 0.0466, which is quite accurate enough for practical purposes. This concordance, unexpected in view of the other facts mentioned, is due principally to the fact that $F_1(x)$ has the correct form at the limit of range. This experiment shows that, despite the regularity of the $\sqrt{b_1}$ distribution for $n = 8$, the problem of finding the nearly exact distribution cannot be treated in cavalier fashion.





11. Probability points for frequencies for samples of 8 or more

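By the Gram-Charlier theorem for symmetrical distributions under general conditions any frequency $f(w)$, where $w$ has mean zero and variance unity, can be expanded in the form

$$f(w) = \exp\left\{\frac{\lambda_4}{4!\,\lambda_2^2}\left(\frac{d}{dw}\right)^4 + \frac{\lambda_6}{6!\,\lambda_2^3}\left(\frac{d}{dw}\right)^6 + \frac{\lambda_8}{8!\,\lambda_2^4}\left(\frac{d}{dw}\right)^8 + \dots\right\}\phi(w), \qquad (11.1)$$

where

$$\phi(w) = \frac{1}{\sqrt{2\pi}}\exp\left(-\tfrac{1}{2}w^2\right),$$

the $\lambda$ being semi-invariants of the original variate. Let $u$ be a normal variate with mean zero and variance unity. Using the method of E. A. Cornish & R. A. Fisher (1937) their expression for $w$ in terms of $u$ has been extended to the following effect:

$$w = u - \frac{\lambda_4}{24\lambda_2^2}x_3 - \frac{\lambda_4^2}{384\lambda_2^4}y_5 - \frac{\lambda_6}{720\lambda_2^3}x_5 + \frac{\lambda_4^3}{3072\lambda_2^6}y_7 - \frac{\lambda_4\lambda_6}{1152\lambda_2^5}z_7 - \frac{\lambda_8}{40320\lambda_2^4}x_7 + \dots, \qquad (11.2)$$

where the $x_r$ are Hermite polynomials in $u$ of the degree indicated. The $x$, $y$ and $z$ terms in (11.2) are as follows:

$x_3 = -u^3 + 3u$,                      $y_7 = 9u^7 - 131u^5 + 451u^3 - 321u$,

$y_5 = 3u^5 - 24u^3 + 29u$,             $z_7 = u^7 - 17u^5 + 69u^3 - 57u$, (11.3)

$x_5 = -u^5 + 10u^3 - 15u$,             $x_7 = -u^7 + 21u^5 - 105u^3 + 105u$.

At (11.2) the expansion is taken to $O(n^{-3})$ because $\lambda_{2r}/\lambda_2^r$ is $O(n^{-r+1})$ when the $\lambda$ are semi-invariants of $\sqrt{b_1}$ for samples of $n$.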
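The expansion (11.2) is easily evaluated by machine. The sketch below is illustrative only: it takes the polynomials of (11.3) and the sign pattern of (11.2) as they can be inferred from the coefficient table given later in this section, and uses the ratios $\lambda_4/\lambda_2^2$, $\lambda_6/\lambda_2^3$ and $\lambda_8/\lambda_2^4$ quoted for $n = 8$. The function names are hypothetical, and the normal deviate is obtained from Python's `statistics.NormalDist`.

```python
from statistics import NormalDist

# Ratios for n = 8 as quoted in this section:
# lambda4/lambda2^2, lambda6/lambda2^3, lambda8/lambda2^4.
R4, R6, R8 = 0.1357, -1.1612, 2.6577


def polynomials(u):
    """The six polynomials of (11.3) evaluated at the normal deviate u."""
    return {
        "x3": -u**3 + 3 * u,
        "y5": 3 * u**5 - 24 * u**3 + 29 * u,
        "x5": -u**5 + 10 * u**3 - 15 * u,
        "y7": 9 * u**7 - 131 * u**5 + 451 * u**3 - 321 * u,
        "z7": u**7 - 17 * u**5 + 69 * u**3 - 57 * u,
        "x7": -u**7 + 21 * u**5 - 105 * u**3 + 105 * u,
    }


def standardized_point(tail_probability, r4=R4, r6=R6, r8=R8):
    """Approximate w from (11.2) for a given upper-tail probability."""
    u = NormalDist().inv_cdf(1 - tail_probability)
    p = polynomials(u)
    return (u
            - r4 * p["x3"] / 24
            - r4**2 * p["y5"] / 384
            - r6 * p["x5"] / 720
            + r4**3 * p["y7"] / 3072
            - r4 * r6 * p["z7"] / 1152
            - r8 * p["x7"] / 40320)


for level in (0.10, 0.05, 0.025, 0.01, 0.001):
    print(level, round(standardized_point(level), 6))
```

The values obtained in this way can be set beside the column of $w$ tabulated below for $n = 8$; agreement to about four decimal places is all that the rounded ratios allow.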

The $x_r$, $y_r$ and $z_r$ functions at various probability levels are as follows:

Function                          Probability points
             0.10          0.05          0.025         0.01          0.001
u            1.281552      1.644854      1.959964      2.326348      3.090223
x_3          1.739867      0.484338     -1.649229     -5.610905    -20.239364
y_5         -2.97984     -22.98240     -37.09066     -30.28992     226.9286
x_5         -1.632248      7.789164     16.968942     22.868797    -33.058481
y_7        136.1309      194.9563      -22.4508     -675.7597    -1286.263
z_7         19.09291      41.19959      27.20947     -53.45639    -261.9424
x_7        -19.5234      -74.2936      -88.4883      -15.5762      362.6626


For $n = 8$ the semi-invariants, etc., required are

$\lambda_2 = 0.363636$,
$\lambda_4 = 0.017950$,        $\lambda_4/\lambda_2^2 = 0.1357$,
$\lambda_6 = -0.055836$,       $\lambda_6/\lambda_2^3 = -1.1612$,
$\lambda_8 = 0.046470$,        $\lambda_8/\lambda_2^4 = 2.6577$.




If the formula at (11.2) were quite correct, and if we computed, at any probability level $\epsilon$, the value of $w$, then set $x = \lambda_2^{1/2}w$ and from (10.4) computed the probability from end of range, the result should be exactly $\epsilon$, assuming, of course, that (10.4) gives the exact frequency distribution. When this procedure is carried out at the different pseudo-probability (i.e. the probability of $x$) levels indicated, the following results are found:

Pseudo-probability                                   0.10       0.05       0.025      0.01       0.001
(a) True probability (to $x = \lambda_2^{1/2}w$)     0.098866   0.050469   0.026825   0.011604   0.001166    (11.6)
(b) Normal probability (to $x = \lambda_2^{1/2}u$)   0.095564   0.052576   0.029502   0.013419   0.001090

The correspondence at (a) is obviously satisfactory. At first sight it might appear that at the 0.01 and 0.001 levels the divergence is (by the standards of this communication) rather marked. Actually this is not the case considering the fantastic difference in the algebraic form of the Gram-Charlier and the actual frequencies near the limit of range. The probabilities at (b) show that the normal curve gives quite a good representation. At $n = 8$, however, the comparison flatters the normal curve since, as R. A. Fisher (1930) has shown, the ratio $\mu_4/\mu_2^2$ actually assumes its normal value of 3 at $n = 7$ and reaches its greatest value at $n = 22$.

We now propose to take a step which is discussed in some detail in the final section. We shall endow the right side of (11.2) with a remainder term which will make the probability of $w$ formally the same as the pseudo-probability at (11.6). The following table shows the value of the variate $t_8\sigma^{-1}$ computed from (10.4) (where $x$ represents $t_8$) at different true probability levels, together with the corresponding value of $w$ computed from (11.2):


Probability    t_8/σ        w            R
0.10           1.253173     1.273231     -82.2
0.05           1.671682     1.666548      21.0
0.025          2.043181     2.008014     143.3
0.01           2.445126     2.389594     227.6
0.001          3.107977     3.079696     115.8


It has been seen that the difference between $w$ as given by (11.2) and the true value is $O(n^{-4})$. Accordingly the values of $R$ were found at the different probability levels by setting

$$w + \frac{R}{n^4} = \frac{t_8}{\sigma}, \qquad (11.8)$$

with $n = 8$. The estimates of the probability points $P$ for values of $n > 8$ are accordingly

p = (n.9) 

the values of $A$, $B$, ..., $G$ and $R$ being given in the following table:


Probability   A          B            C          D          E          F          G           R
0.10          1.281552   -0.0724945    0.00776    0.00227    0.04431   -0.01657    0.000484   -82.2
0.05          1.644854   -0.0201808    0.05985   -0.01082    0.06346   -0.03576    0.001843    21.0
0.025         1.959964    0.0687179    0.09659   -0.02357   -0.00731   -0.02362    0.002195   143.3
0.01          2.326348    0.2337877    0.07888   -0.03176   -0.21997    0.04640    0.000386   227.6
0.001         3.090223    0.8433066   -0.59096    0.04591   -0.41871    0.22738   -0.008995   115.8


The terms in the first four columns agree with, or have been derived from, Cornish & Fisher (1937). The $\lambda$'s are semi-invariants derivable from the exact values of the moments given at (1.2).







As a test, the following is a comparison of the 0.05 and 0.01 probability points for $n = 25$ as derived by E. S. Pearson (1930) (using a Type VII curve) with the values from (11.9):

Probability level    Pearson    Geary
0.05                 0.711      0.707        (11.10)
0.01                 1.061      1.062

Variate (s.d. = 1)

Fig. 3. Frequency of $\sqrt{b_1}$ for $n = 3$.

With standard deviation $\sigma = 0.435$ it is obvious that the differences are not important. Sample number 25 is the lowest for which Pearson computed the probability points, and for two levels only. The formulae at (11.9) can probably be accepted with confidence.

12. Conclusion

From frequency formulae (5.2), (6.8), (7.26), (8.8), (9.10) (with (9.6) and (9.9)) and (10.4) the probability points for $\sqrt{b_1}$ for normal random samples of $n = 3, 4, 5, 6, 7$ and 8, respectively, can be determined without difficulty. The six frequency distributions are illustrated in Figs. 3-8. On each of the frequency curves there is superimposed the normal frequency with the same standard deviation, the intention being to enable a contrast to be made between the several curves by reference each to the normal frequency, and to show the fairly rapid approach of the $\sqrt{b_1}$ frequency to normality with increasing $n$, even for small samples.*

In this research nothing was so remarkable as the transformation which the single step in the iteration, namely that from $n = 5$ to $n = 6$, effected in the shape of the frequency curve. From $n = 6$ on, the join at the links is effected so smoothly as to be almost imperceptible to the eye. The eye, however, flatters the actual approach to normality in the frequency curves, as measured algebraically by the probability points.



Variate (s.d. = 1)

Fig. 4. Frequency of $\sqrt{b_1}$ for $n = 4$.


It may be well, at this stage, to recapitulate. Using integral iteration formula (2.13) (or (2.14)), frequency ordinates were computed at values of the variate termed the 'links' at which the frequency is shown to have functional discontinuities. Using the exact values of the moments (given at (1.2)), and taking into account the known order of contact (§3) of the different functions at the links and the known form assumed by the frequency at the known limit of range, inter-link frequencies were determined in polynomial form. Attention is directed to the use, at the $n = 4$ to 7 (inclusive) stages, of the higher moments for the purpose of checking the general reliability of the frequency curve (or rather series of curves joined at the links).

* As R. A. Fisher (1930) has shown, the approach to normality is not, however, uniform with increasing $n$, as indicated, say, by $\beta_2$. See p. 90 above.

Of far greater practical importance, however, are the formulae (11.9) designed for the estimation of the 0.10, 0.05, 0.025, 0.01 and 0.001 probability points for normal random samples of $\sqrt{b_1}$ for $n \geq 8$. There will be little trouble about finding the corresponding formulae for other probability levels. What degree of confidence can be reposed in these formulae? This raises in an acute form the vexed question (on which the protagonists of different schools were prone to get very vexed indeed a generation ago) of how best to use moments (or semi-invariants) for estimating frequency distributions. The general problem was constantly in the writer's mind during the present research and he would be glad if his colleagues could study the possibilities of the methods which culminated in formulae (11.9) for bridging the chasm which still divides the knowledge (sometimes exact) of the lower moments of statistics like $\sqrt{b_1}$ and $b_2$ and the formulae (however empirically established) for the frequency, in which a measure of confidence can be reposed. This fundamental problem was abandoned some years ago in a thoroughly unsatisfactory condition.

Variate (s.d. = 1)

Fig. 5. Frequency of $\sqrt{b_1}$ for $n = 5$.

The Karl Pearson approach consists essentially in having regard to the 'shape' which experience has shown that frequency curves tend to assume and to use the first four moments for the purpose of determining the constants of the curve. The disadvantage of the Pearson method is that of itself it gives no indication as to whether the resulting curve closely follows the actual frequency; it is necessary to have recourse to such devices as comparing the curve with a frequency distribution determined from hundreds of random sample computations of the statistic under examination. Apart from the tediousness of this method it is often indecisive in regard just to the parts of the frequency which are of most importance, namely the ends, because the small numbers which the check computation throws into these zones are usually subject to large (Poisson) errors.



The Gram-Charlier system, on the other hand, can only be used with confidence when the frequency is fairly close to the normal. In practice the reliability is judged by the convergence of such terms as one can compute from the moments, i.e. if the successive terms show an 'unmistakable' tendency to diminish one feels confident in the computed frequency.

Obviously what both the Pearson, the Gram-Charlier and other frequency systems require is a Remainder Theorem. Since, however, an infinite number of moments are required to define a frequency distribution, with only a few moments known the most that can be expected is that upper (or lower) limits of the probability of the statistic can be established as functions of the known moments. This is what Tchebychev's Theorem, and theorems of the type, do. Too much cannot be expected from the knowledge of a few moments; the approximations are almost invariably too rough for statistical use, when a high standard of efficiency is required; and M. Fréchet (1937) has shown that the Tchebychev type approximations are the best, given the assumptions, which can be made. For all their great mathematical importance (incidentally for their justification for the statistician of 'the faith that is in him'), it seems to the writer that research on these lines will not produce formulae which will be statistically utilizable in general conditions; but he may be quite wrong.



Knowing the earlier moments the Cornish-Fisher type expression (depending on the Gram-Charlier form of frequency) gives, at any probability level, an expansion for the variate to a defined order in the sample number. As might be surmised from the coefficients of the normal moments (e.g. (1.2) above), the coefficients in powers of $n^{-1}$ in the expansion of the variate usually tend to increase rapidly. In the present paper a remainder term of suitable order in $n$ has been added to the known terms in the former expansion and its coefficient found by reference to the (assumed) exactly known expansion for $n = 8$. Clearly two more terms (in $n^{-5}$ and $n^{-6}$ respectively) could have been found had we iterated the frequency to $n = 9$ and $n = 10$, respectively, though this was not deemed necessary in the



ON THE COMPUTATION OF UNIVERSAL MOMENTS OF TESTS OF
STATISTICAL NORMALITY DERIVED FROM SAMPLES DRAWN
AT RANDOM FROM A NORMAL UNIVERSE. APPLICATION TO
THE CALCULATION OF THE SEVENTH MOMENT OF $b_2$

By R. C. GEARY and J. P. G. WORLLEDGE


1. Introductory


The principal object of this communication is to develop a computational technique appropriate to the formula given by one of the authors (Geary, 1933). By way of illustration the formula is applied to the computation of the seventh moment of

$$b_2 = \frac{m_4}{m_2^2}, \qquad m_r = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^r, \qquad (1.1)$$

where $x_1, x_2, \dots, x_n$ are the measures of the random sample of $n$ and of which $\bar{x}$ is the arithmetic mean. Universal normality is assumed throughout.

A glance at formula (3.9) in which this paper culminates will indicate that the task of deriving higher normal moments of $b_2$ is not one to be undertaken in a frivolous spirit. The work finds its main justification in the conviction of the authors that accurate (if not exact) values of the probability points of $b_2$ can be found in terms of the moments of $b_2$ for all values of $n$ using a method which has proved successful in the case of the analogous test of asymmetry, involving

$$\sqrt{b_1} = \frac{m_3}{m_2^{3/2}}. \qquad (1.2)$$

In turn, the importance of the determination of accurate probabilities for $\sqrt{b_1}$ and $b_2$ for normal samples derives from the facts revealed by unpublished work by one of the authors. This shows (1) that probabilistic inferences drawn from the well-known significance tests based on the assumption of universal normality are apt to go astray when, in fact, the universe is not normal, and (2) that $\sqrt{b_1}$ and $b_2$ provide the most efficient tests of asymmetry and kurtosis, respectively, in indefinitely large samples, amongst wide fields of alternative tests and of alternative non-normal universes.

R. A. Fisher (1930) has given the exact values of the second, fourth and sixth moments of $\sqrt{b_1}$ and J. Pepper (1932) the eighth moment. In the former paper R. A. Fisher also gave the values of the second and third moments of $b_2$. The moment field was extended by J. Wishart (1930) and in a joint paper by R. A. Fisher & J. Wishart (1931). C. T. Hsu & D. N. Lawley (1940) gave the fifth and sixth moments of $b_2$. All these authors used the combinatorial method due to R. A. Fisher (1929). The present approach is entirely different.


2. The fundamental relation

To make the exposé complete it may be useful to reproduce the relevant part (which is quite brief) of the 1933 paper. The method used is due essentially to C. C. Craig (1928), applied to the normal case. Using, in the usual notation, a prefixed $E$ to indicate 'expected' or, more accurately, 'average value for all samples', we have for the characteristic function of the

$$z_i = x_i - \bar{x} \quad (i = 1, 2, \dots, n), \qquad (2.1)$$

the expression

$$E\exp\sum_{i=1}^{n} t_i(x_i - \bar{x}), \qquad (2.2)$$

where the $t_i$ are $n$ parameters, so that $E(z_1^{a_1}z_2^{a_2}\dots z_p^{a_p})$, where the $a_i$ are any $p$ positive integers, is the coefficient of $t_1^{a_1}t_2^{a_2}\dots t_p^{a_p}$ in the expansion of

$$a_1!\,a_2!\dots a_p!\;E\exp\{t_1(x_1 - \bar{x}) + t_2(x_2 - \bar{x}) + \dots + t_p(x_p - \bar{x})\}.$$

The exponent can be written in the form

$$x_1(t_1 - \bar{t}) + x_2(t_2 - \bar{t}) + \dots + x_n(t_n - \bar{t}), \qquad (2.3)$$

where $\bar{t} = \frac{1}{n}\sum_{i=1}^{n} t_i$. Since the $x_i$ are independent,

$$E\exp\sum_{i=1}^{n} x_i(t_i - \bar{t}) = \prod_{i=1}^{n} E\exp x_i(t_i - \bar{t}). \qquad (2.4)$$

Assuming, as we may without loss of generality, that the normal universe of the $x_i$ has mean zero and unit standard deviation, we have

$$E\exp x_i(t_i - \bar{t}) = \exp\tfrac{1}{2}(t_i - \bar{t})^2. \qquad (2.5)$$

Hence

$$a_1!\dots a_p!\;E\exp\sum_{i=1}^{n} t_i(x_i - \bar{x}) = a_1!\dots a_p!\exp\tfrac{1}{2}\sum_{i=1}^{n}(t_i - \bar{t})^2.$$

By definition, the power $f$ of a term is given by $f = \sum_i a_i$, the dimension by $p$. It is clear that the required universal mean value of

$$E(x_1 - \bar{x})^{a_1}\dots(x_p - \bar{x})^{a_p}$$

will be found as the coefficient of $t_1^{a_1}\dots t_p^{a_p}$ in the expansion of

$$\frac{a_1!\dots a_p!}{2^k\,k!}\left\{\sum_{i=1}^{p} t_i^2 - \frac{1}{n}\left(\sum_{i=1}^{p} t_i\right)^2\right\}^k, \qquad (2.6)$$

where $2k = f$.
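The fundamental relation can be tried on small cases by brute-force expansion. The following sketch is not the authors' computational scheme (which is developed in the next section); it simply expands the quadratic form $\Sigma t_i^2 + v(\Sigma t_i)^2$, with $v = -1/n$, raised to the power $k$ and picks out the required coefficient in exact rational arithmetic. The function names are illustrative, and it is assumed that the dimension $p$ does not exceed $n$.

```python
from fractions import Fraction
from math import factorial


def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            key = tuple(a + b for a, b in zip(ea, eb))
            out[key] = out.get(key, 0) + ca * cb
    return out


def normal_moment(powers, n):
    """E[(x1 - xbar)^a1 ... (xp - xbar)^ap] for a sample of n from a unit normal universe."""
    f = sum(powers)
    if f % 2:
        return Fraction(0)  # odd total power: the moment vanishes by symmetry
    k, p, v = f // 2, len(powers), Fraction(-1, n)
    # Quadratic form sum t_i^2 + v (sum t_i)^2 in the parameters t_1 .. t_p.
    quad = {}
    for i in range(p):
        for j in range(p):
            key = [0] * p
            key[i] += 1
            key[j] += 1
            quad[tuple(key)] = quad.get(tuple(key), 0) + v + (1 if i == j else 0)
    expansion = {tuple([0] * p): Fraction(1)}
    for _ in range(k):
        expansion = poly_mul(expansion, quad)
    scale = Fraction(1, 2**k * factorial(k))
    for a in powers:
        scale *= factorial(a)
    return scale * expansion.get(tuple(powers), 0)


print(normal_moment((2,), 5), normal_moment((4,), 5), normal_moment((2, 2), 5))
```

For $n = 5$ these reproduce the elementary results $E z_1^2 = 1 - 1/n$, $E z_1^4 = 3(1 - 1/n)^2$ and $E z_1^2 z_2^2 = (1 - 1/n)^2 + 2/n^2$. Direct expansion of this kind becomes impracticable well before the power 28 required below, which is the reason for the scheme that follows.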


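3. The computational scheme

The computational scheme, which is quite general, will most clearly be outlined by reference to the computation of the exact value of a specific moment (from origin) $\mu_7'(m_4)$, for the derivation of which it was primarily designed. Then

$$E(m_4^7) = \frac{1}{n^7}E(z_1^4 + z_2^4 + \dots + z_n^4)^7 = \frac{1}{n^7}\Big[\,nE(28)$$
$$+\; n(n-1)\Big\{\frac{7!}{6!\,1!}E(24\cdot 4) + \frac{7!}{5!\,2!}E(20\cdot 8) + \frac{7!}{4!\,3!}E(16\cdot 12)\Big\}$$
$$+\; n(n-1)(n-2)\Big\{\frac{7!}{5!\,1!^2\,2!}E(20\cdot 4^2) + \frac{7!}{4!\,2!\,1!}E(16\cdot 8\cdot 4) + \frac{7!}{3!^2\,2!\,1!}E(12^2\cdot 4) + \frac{7!}{3!\,2!^2\,2!}E(12\cdot 8^2)\Big\}$$
$$+\; n(n-1)(n-2)(n-3)\Big\{\frac{7!}{4!\,1!^3\,3!}E(16\cdot 4^3) + \frac{7!}{3!\,2!\,1!^2\,2!}E(12\cdot 8\cdot 4^2) + \frac{7!}{2!^3\,3!\,1!}E(8^3\cdot 4)\Big\}$$
$$+\; n(n-1)(n-2)(n-3)(n-4)\Big\{\frac{7!}{3!\,1!^4\,4!}E(12\cdot 4^4) + \frac{7!}{2!^2\,2!\,1!^3\,3!}E(8^2\cdot 4^3)\Big\}$$
$$+\; n(n-1)(n-2)(n-3)(n-4)(n-5)\frac{7!}{2!\,1!^5\,5!}E(8\cdot 4^5)$$
$$+\; n(n-1)(n-2)(n-3)(n-4)(n-5)(n-6)\frac{7!}{1!^7\,7!}E(4^7)\Big], \qquad (3.1)$$

where, for example,

$$E(12\cdot 8\cdot 4^2) = E z_1^{12}z_2^8 z_3^4 z_4^4 = E(x_1 - \bar{x})^{12}(x_2 - \bar{x})^8(x_3 - \bar{x})^4(x_4 - \bar{x})^4.$$

There are, accordingly, fifteen terms made up of one of dimension one, three of dimension two, four of dimension three, etc. The structure of the numerical coefficients will be noted: in particular that, when the power of a factorial appears in the denominator, its factorial also appears. Each of the fifteen $E$ terms will be evaluated separately, grouped by dimensions and multiplied by the $n$-factors.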

As already stated, the value of $E(a_1a_2a_3\dots)$ will be found as the coefficient of $t_1^{a_1}t_2^{a_2}t_3^{a_3}\dots$ in the expansion of

$$\frac{a_1!\,a_2!\,a_3!\dots}{2^{14}\,14!}\{\Sigma t_i^2 + v(\Sigma t_i)^2\}^{14}, \qquad (3.2)$$

with $v = -1/n$. In this case, of course, $f = 28$.

Expand (3.2) in powers of $v$ by the binomial theorem. Each of the $v$ power terms will, in general, make a numerical contribution to the value of $E(a_1a_2\dots)$ which will, accordingly, be represented by a polynomial in $v$ of degree 14. The term in $v^s$ will be

$$\frac{a_1!\,a_2!\dots}{14!\,2^{14}}\cdot\frac{14!}{s!\,(14-s)!}\,v^s\,(14-s)!\,(2s)!\;\Sigma_s\frac{1}{(a_1 - 2s_1)!\,s_1!\,(a_2 - 2s_2)!\,s_2!\dots}. \qquad (3.3)$$

In the $\Sigma_s$, summation extends to all non-negative integer series $s_1, s_2, \dots$ so that $\Sigma s_i = (14 - s)$, $s_1$ being associated with $a_1$, $s_2$ with $a_2$, etc. The values which the $s_i$ can assume are obviously restricted further by the condition that

$$a_i \geq 2s_i.$$

Let the series $(\Sigma_0, \Sigma_1, \Sigma_2, \dots)$ be termed the reciprocal factorial vector (hereafter usually written 'r.f.v.') of $a_1a_2a_3\dots$, the terms of the vector being regarded as of the order indicated by the subscript. The vector will be indicated by clarendon type. From the computational point of view the following relation is fundamental:

A × B = AB, (3.4)

where A = $(a_1a_2\dots)$ and B = $(b_1b_2\dots)$ are any two r.f.v.'s. The multiplication sign at (3.4) is defined as follows: the terms of A are multiplied respectively by $v^0, v^1, v^2$, etc., and added to give a scalar $A$; the terms of B in the reverse order are also multiplied respectively by the corresponding powers of $v$ and summed to give $B$. The coefficients of $v^0, v^1, v^2, \dots$ in the product (in the ordinary sense) $AB$ give the vector AB. Relation (3.4) is immediately evident from the form of $\Sigma_s$ in (3.3). From this relation it is quite easy to build up r.f.v.'s from those of lower order: 44 from 4, 84 from 8 and 4, 88444 from 8 and 8444, or 88 and 444, etc.
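Relation (3.4) amounts to ordinary polynomial multiplication of the vectors, and can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: all powers $a_i$ are taken to be even, the order-$s$ term of the r.f.v. of a single power $a$ is taken as $1/\{(2s)!\,(a/2 - s)!\}$ (which follows from (3.3) on putting $s_1 = a/2 - s$), and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def rfv_single(a):
    """r.f.v. of a single even power a: the order-s term is 1 / ((2s)! (a/2 - s)!)."""
    half = a // 2
    return [Fraction(1, factorial(2 * s) * factorial(half - s)) for s in range(half + 1)]


def rfv_product(A, B):
    """Relation (3.4): the coefficients of the ordinary product of two r.f.v. polynomials."""
    out = [Fraction(0)] * (len(A) + len(B) - 1)
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            out[i + j] += a * b
    return out


def rfv(powers):
    """r.f.v. of a1 a2 ..., built up one power at a time."""
    vec = [Fraction(1)]
    for a in powers:
        vec = rfv_product(vec, rfv_single(a))
    return vec


def moment_in_v(powers, v):
    """E(a1 a2 ...) as prod(a_i!)/2^k times the sum over s of v^s (2s)!/s! times the r.f.v. term."""
    k = sum(powers) // 2
    scale = Fraction(1, 2**k)
    for a in powers:
        scale *= factorial(a)
    return scale * sum(v**s * Fraction(factorial(2 * s), factorial(s)) * term
                       for s, term in enumerate(rfv(powers)))


by_884_44 = rfv_product(rfv([8, 8, 4]), rfv([4, 4]))
by_8844_4 = rfv_product(rfv([8, 8, 4, 4]), rfv([4]))
print(by_884_44 == by_8844_4, by_884_44[5])
```

The two factorizations of 88444 give identical vectors, which is the check described in the text, and the term of the fifth order is 27737 divided by $7\cdot 5\cdot 3^3\cdot 2^{13}$.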



Having found all fifteen r.f.v.'s the second step in the computational process is to form the scalar product of each r.f.v. and $(2s)!/s!$. The latter will be termed the $v$-multipliers which, it is important to note, are the same for all the terms in (3.1). This scalar product, from (3.3), gives $E(a_1a_2\dots)$ divided by $a_1!\,a_2!\dots/2^{14}$.

The latter are multiplied by the numerical factors in (3.1) to give what are termed the constant multipliers, 'constant' in the sense that they are the same for all the $v$ power terms in each of the $E$'s in (3.1), but these constant terms are different for the different $E$'s. For example, the constant multiplier for the term $E z_1^{12}z_2^8z_3^4z_4^4$ is

$$\frac{12!\,8!\,4!^2\,7!}{3!\,2!\,1!^2\,2!\,2^{14}}. \qquad (3.5)$$

Note the 'absolute constants' $7!$ and $2^{14}$, and that the powers of the term appear as factorials in the numerator and factorials one-fourth of these powers in the denominator. In the denominator is also a $2!$ which is the factorial of the factorial power.

The third step in the computation is to sum the terms of the same dimensions. The final step consists in the multiplication of the terms of the different dimensions by the $v$-factors as follows:

Table 1. v-factors

Dimension    v-factors
1            v^6
2            -(v^6 + v^5)
3            2v^6 + 3v^5 + v^4
4            -(6v^6 + 11v^5 + 6v^4 + v^3)
5            24v^6 + 50v^5 + 35v^4 + 10v^3 + v^2
6            -(120v^6 + 274v^5 + 225v^4 + 85v^3 + 15v^2 + v)
7            720v^6 + 1764v^5 + 1624v^4 + 735v^3 + 175v^2 + 21v + 1

The $v$-factors at (3.6) are, of course, the $n$-factors in (3.1) with $v = -1/n$.
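The $v$-factors can be generated mechanically, since $n(n-1)\dots(n-d+1)/n^7 = (-v)^{7-d}(1+v)(1+2v)\dots(1+(d-1)v)$ when $v = -1/n$. The short sketch below (the function name and the ascending-power list representation are choices made here, not the authors') reproduces the coefficients of Table 1.

```python
from fractions import Fraction


def v_factor(dimension, total=7):
    """Coefficients (ascending powers of v) of n(n-1)...(n-d+1)/n^total with n = -1/v."""
    coeffs = [1]
    for j in range(1, dimension):            # multiply by (1 + j v)
        nxt = [0] * (len(coeffs) + 1)
        for power, c in enumerate(coeffs):
            nxt[power] += c
            nxt[power + 1] += j * c
        coeffs = nxt
    shift = total - dimension                # overall factor (-v)^(total - d)
    sign = -1 if shift % 2 else 1
    return [0] * shift + [sign * c for c in coeffs]


def evaluate(coeffs, v):
    return sum(c * v**power for power, c in enumerate(coeffs))


for d in range(1, 8):
    print(d, v_factor(d))

# Spot check against the n-factor itself for n = 10, dimension 3.
print(evaluate(v_factor(3), Fraction(-1, 10)) == Fraction(10 * 9 * 8, 10**7))
```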

To deal with the very large whole numbers and their reciprocals which arise in factorial computation we had recourse to a prime number index notation. For this purpose the number is factorized into powers of the lower primes; we have used the notation for primes not exceeding 31. Thus

$746{,}137{,}199{,}808{,}000 = 6847\cdot 13^1\cdot 11^1\cdot 7^2\cdot 5^3\cdot 3^5\cdot 2^9$

is written in this notation 6847[112359],

the digits in the square brackets [ ] being the powers of the lowest primes arranged in ascending order from the right. The ordinary number 6847 will be known as the coefficient and the symbolical number in square brackets as the primal of the original number. Note that in this example the notation effects an economy from 15 to 10 in the number of digits required to describe the number. Should the original number not be factorizable by a particular small prime a 0 will be inserted in the proper place, e.g. [10358] means that 7 is not a factor of the number represented. If, as often happens with the first two primes, the indices exceed 9, decimal points are used, e.g. [124·11·17] means that the original number has $2^{17}$ and $3^{11}$ as factors. The primal notation can be used when the indices are all positive or all negative: occasionally, however, + and - signs have to be mixed in the primal (see Table 6).

With little practice great facility is acquired in applying the ordinary rules to numbers in primal notation. For multiplication or division corresponding digits in the primals are added or subtracted, the coefficients being dealt with in the ordinary way. In addition or subtraction common factors in the primals are immediately evident and the coefficient of the sum (or difference) is derived usually by a single product-sum (or product-difference) operation on a multiplying machine. It may be observed that all the work for this paper was executed without inconvenience on small hand multiplying machines with capacity 9 × 8 × 13.
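The primal notation is readily mechanized. The sketch below is an illustration only: the function names are invented here, and the rule adopted for placing the points (before any index exceeding 9, and before the index that follows one) is an assumption about the typographical convention rather than a statement of it.

```python
from math import factorial

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]


def to_primal(number):
    """Split a positive integer into (coefficient, indices of 2, 3, 5, ... up to 31)."""
    exponents = []
    for p in PRIMES:
        e = 0
        while number % p == 0:
            number //= p
            e += 1
        exponents.append(e)
    while exponents and exponents[-1] == 0:   # drop unused high primes
        exponents.pop()
    return number, exponents


def format_primal(coefficient, exponents):
    """Render with the highest prime first; indices above 9 are set off by points."""
    digits = list(reversed(exponents)) or [0]
    body = ""
    for i, e in enumerate(digits):
        if e > 9 or (i > 0 and digits[i - 1] > 9):
            body += "·" if body else ""
        body += str(e)
    lead = "" if coefficient == 1 else str(coefficient)
    return f"{lead}[{body}]"


print(format_primal(*to_primal(746137199808000)))
print(format_primal(*to_primal(factorial(12))))
print(format_primal(*to_primal(factorial(24))))
```

Multiplication of two numbers in this form reduces to multiplying the coefficients and adding the index lists term by term, which is the property exploited in the tables that follow.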

In the following tables the first thirty-two factorials, the $v$-multipliers and the constant multipliers required for the computation of $E(m_4^7)$ are expressed in primal notation.


Table 2. Factorials in primal notation

 0! = 1! = [0]                17! = [111236·15]
 2! = [1]                     18! = [111238·16]
 3! = [11]                    19! = [1111238·16]
 4! = [13]                    20! = [1111248·18]
 5! = [113]                   21! = [1111349·18]
 6! = [124]                   22! = [1112349·19]
 7! = [1124]                  23! = [11112349·19]
 8! = [1127]                  24! = [1111234·10·22]
 9! = [1147]                  25! = [1111236·10·22]
10! = [1248]                  26! = [1112236·10·23]
11! = [11248]                 27! = [1112236·13·23]
12! = [1125·10]               28! = [1112246·13·25]
13! = [11125·10]              29! = [11112246·13·25]
14! = [11225·11]              30! = [11112247·14·26]
15! = [11236·11]              31! = [111112247·14·26]
16! = [11236·15]              32! = [111112247·14·31]


Table 3. v-Multipliers in factorial and primal notation

Term in    Coefficient
v^0   :    0!/0!    = [0]
v^1   :    2!/1!    = [1]
v^2   :    4!/2!    = [12]
v^3   :    6!/3!    = [113]
v^4   :    8!/4!    = [1114]
v^5   :    10!/5!   = [1135]
v^6   :    12!/6!   = [11136]
v^7   :    14!/7!   = [111137]
v^8   :    16!/8!   = [111248]
v^9   :    18!/9!   = [1111249]
v^10  :    20!/10!  = [1111124·10]
v^11  :    22!/11!  = [1111225·11]
v^12  :    24!/12!  = [11111225·12]
v^13  :    26!/13!  = [11111245·13]
v^14  :    28!/14!  = [11111248·14]


Table 4. Constant multipliers in factorial and primal notation

Required for computation of the undermentioned term in (3.1)

E(28)          28! 7! / (7! 2^14)                         = [1112246·13·11]
E(24·4)        24! 4! 7! / (6! 1! 2^14)                   = [1111244·11·11]
E(20·8)        20! 8! 7! / (5! 2! 2^14)                   = [111145·11·11]
E(16·12)       16! 12! 7! / (4! 3! 2^14)                  = [1246·11·11]
E(20·4^2)      20! 4!^2 7! / (5! 1!^2 2! 2^14)            = [111134·11·10]
E(16·8·4)      16! 8! 4! 7! / (4! 2! 1! 2^14)             = [1145·10·11]
E(12^2·4)      12!^2 4! 7! / (3!^2 2! 1! 2^14)            = [235·11·10]
E(12·8^2)      12! 8!^2 7! / (3! 2!^2 2! 2^14)            = [145·10·10]
E(16·4^3)      16! 4!^3 7! / (4! 1!^3 3! 2^14)            = [1134·9·10]
E(12·8·4^2)    12! 8! 4!^2 7! / (3! 2! 1!^2 2! 2^14)      = [134·10·10]
E(8^3·4)       8!^3 4! 7! / (2!^3 3! 1! 2^14)             = [44·8·10]
E(12·4^4)      12! 4!^4 7! / (3! 1!^4 4! 2^14)            = [12398]
E(8^2·4^3)     8!^2 4!^3 7! / (2!^2 2! 1!^3 3! 2^14)      = [3389]
E(8·4^5)       8! 4!^5 7! / (2! 1!^5 5! 2^14)             = [2188]
E(4^7)         4!^7 7! / (1!^7 7! 2^14)                   = [77]


The theory will be illustrated by reference to the computation of $E z_1^8z_2^8z_3^4z_4^4z_5^4 = E(8^24^3)$.

First the r.f.v. 88444 is found as the product 884 × 44 by setting down in equal spaces the terms of 884 and on a movable slip spaced to the former the terms of 44 in reverse:

All primals are negative

884                   [27]   5[27]   109[39]   111[137]   803[112·10]   1493[114·10]   389[122·12]   119[124·11]   1543[225·15]   31[225·15]   [225·17]
Movable slip (44)            [26]    [13]      7[13]      [1]           [2]

The term in 88444 from the position illustrated is that of the 5th order, namely,

5[4·13] + 109[4·12] + 111·7[14·10] + 803[112·11] + 1493[114·12]
   = [114·13](5·7·5 + 109·7·5·2 + 777·7·8 + 803·9·4 + 1493·2)
   = [114·13](3·27737) = 27737[113·13].

The manner of computation is indicated: first the largest (negative) digits in each of the four positions of the primals are underlined and the underlined set is regarded as the common factor. Note how, at the final stage, the factor 3 of the coefficient reduces the primal digit from 4 to 3. From the entries in the round brackets ( ) it will be clear that, as stated above, the procedure is well adapted to the multiplying machine. The full calculation of 88444 is shown in Table 5.

The identity of the r.f.v.'s from the two factorizations of 88444 constitutes an absolute check of the work. The calculation of $E(8^24^3)$ required for (3.1) is completed in Table 6.

In practice the figures in columns (4) and (5) of this table were derived from those in column (3), and in Tables 3 and 4 by entering the latter on two movable slips and folding opposite each entry, as required. This stage of the work was rapidly executed. The sum-product of columns (1), (2) and (5) give the value of $E(8^24^3)$. All the r.f.v.'s required for the calculation of the $E$'s for (3.1) are given in the appendix.



Table 5. Calculation of reciprocal factorial vector 88444
All primals are negative

(i) By 884 × 44

Order                                                                              r.f.v. of 88444
 0 : [29]                                                                          = [29]
 1 : [28] + 5[29]                                                                  = 7[29]
 2 : 7[3·10] + 5[28] + 109[3·11]                                                   = 9[·11]
 3 : [3·10] + 35[3·10] + 109[3·10] + 111[139]                                      = 947[13·10]
 4 : [4·13] + 5[3·10] + 763[4·12] + 111[138] + 803[112·12]                         = 1811[110·13]
 5 : 5[4·13] + 109[4·12] + 777[14·10] + 803[112·11] + 1493[114·12]                 = 27737[113·13]
 6 : 109[5·15] + 111[14·10] + 5621[113·13] + 1493[114·11] + 389[122·14]            = 1783141[125·15]
 7 : 111[15·13] + 803[113·13] + 1493[15·13] + 389[122·13] + 119[124·13]            = 20627[115·13]
 8 : 803[114·16] + 1493[115·13] + 389[23·15] + 119[124·12] + 1543[225·17]          = 1772417[225·17]
 9 : 1493[116·16] + 389[123·15] + 119[25·14] + 1543[225·16] + 31[225·17]           = 547889[226·17]
10 : 389[124·18] + 119[125·14] + 1543[126·18] + 31[225·16] + [225·19]              = 151331[226·19]
11 : 119[126·17] + 1543[226·18] + 217[226·18] + [225·18]                           = 127[223·18]
12 : 1543[227·21] + 31[226·18] + [126·20]                                          = 2329[227·21]
13 : 31[227·21] + [226·20]                                                         = 37[227·21]
14 : [227·23]                                                                      = [227·23]

(ii) By 8844 × 4

Order                                                                              r.f.v. of 88444
 0 : [29]                                                                          = [29]
 1 : [29] + [18]                                                                   = 7[29]
 2 : [3·11] + [18] + 85[3·10]                                                      = 9[·11]
 3 : [2·10] + 85[3·10] + 169[12·10]                                                = 947[13·10]
 4 : 85[4·12] + 169[12·10] + 11113[104·13]                                         = 1811[110·13]
 5 : 169[13·12] + 11113[104·13] + 5137[114·11]                                     = 27737[113·13]
 6 : 11113[105·15] + 5137[114·11] + 22703[124·13]                                  = 1783141[125·15]
 7 : 5137[115·13] + 22703[124·13] + 9341[125·13]                                   = 20627[115·13]
 8 : 22703[125·15] + 9341[125·13] + 90541[225·17]                                  = 1772417[225·17]
 9 : 9341[126·15] + 90541[225·17] + 2453[225·16]                                   = 547889[226·17]
10 : 90541[226·19] + 2453[225·16] + 137[126·18]                                    = 151331[226·19]
11 : 2453[226·18] + 137[126·18] + 17[226·18]                                       = 127[223·18]
12 : 137[127·20] + 17[226·18] + [226·21]                                           = 2329[227·21]
13 : 17[227·20] + [226·21]                                                         = 37[227·21]
14 : [227·23]                                                                      = [227·23]


Finally, the $E$'s are multiplied by the appropriate $v$-factors given in Table 1, to give the value of $E(m_4^7)$. Now R. A. Fisher (1930) (see also Geary, 1933) has shown that

$$\mu_7'(b_2) = E(b_2^7) = E(m_4^7)/E(m_2^{14}), \qquad (3.7)$$

$$E(m_2^{14}) = (n - 1)(n + 1)(n + 3)\dots(n + 23)(n + 25)/n^{14}. \qquad (3.8)$$

Finally, = (3W3 + 21 1 - 3 W 2 + 64,802 13, 164,290- 3 

+ 608,584,331 -SV +26,489,306,481 • 3W + 74,020,784,452- 7 - SW 

- 72,634,861,124 - 7 • 5 • 3«»« + 407,081 ,273,656 - 7 • 6 - 3«n® 

- 1,287,510,783,723- 7- 6- 38»«+ 2,526,463,322,982 - 7 - 5 • 3 

- 280,521 ,238,122 - 11 ■ 7 • 5 • 3%® + 3.036,544,767 • 13 • 1 1 • 7 • 6* - 3% 

- 135,393,525- 13- 11 •7®5a3«)/(» + 1) (%+ 3) ... (w+ 23) (w+ 25). (3-9) 



106 


H. C. Geary ari> J. P. G. Worlledge 

4. Corroboration of formulae

An integral part of the present work is the technique of check. To be of value the formulae at
(3·9) and (3·10) must be absolutely correct because (1) any errors made in factorial work are
fairly certain to be large and (2) the formulae are designed for use when n is small, when
relatively small errors in the numerical coefficients may materially affect the results. Furthermore,
it is almost impossible to avoid error (even in a joint work like the present) with so
many individual calculations involving numbers astronomically large. As will appear, there
is a satisfactory, though not absolute, check at the final stage; but if it reveals error it does
not show where the error occurred, so that, if this were the sole check, there would be no


Table 6. Calculation of $E(8^2 4^3)$ from the r.f.v. $8^2 4^3$

Term    r.f.v. 8²4³                      (3) × ν-multiplier    (4) × constant
        Coefficient    Primal (neg.)     (Table 3)             multiplier [3389]
(1)     (2)            (3)               (4)                   (5)

ν^0     1              [29]              [-2-9]                [3360]
ν^1     7              [29]              [-2-8]                [3361]
ν^2     9              [-11]             [1-9]                 [3390]
ν^3     947            [13-10]           [-2-7]                [3362]
ν^4     1811           [110-13]          [1-9]                 [3390]
ν^5     27737          [113-13]          [-8]                  [3381]
ν^6     1783141        [126-16]          [10-1-2-9]            [13260]
ν^7     20627          [116-13]          [1100-2-6]            [113363]
ν^8     1772417        [226-17]          [11-10-1-9]           [112370]
ν^9     547889         [226-17]          [111-10-2-8]          [1112361]
ν^10    161331         [226-19]          [1111-10-2-9]         [11112360]
ν^11    127            [223-18]          [1111002-7]           [111133-10-2]
ν^12    2329           [227-21]          [1111100-2-9]         [111113360]
ν^13    37             [227-21]          [1111102-2-8]         [111113661]
ν^14    1              [227-23]          [11111021-9]          [111113690]


alternative but to face the tedium of complete recalculation. It is essential to devise an
absolute check at each stage. This has been done for the present technique.

The first check is the $n = 1$ (or $\nu = -1$) check. This is applicable to the E's (see (3·1)) of
dimension one, two and three. It derives from the fact that

{(l + v)2’«|+2vEpi<^)''- (4-1) 

It will be immediately evident from the latter that when $\nu = -1$ the following terms vanish
identically:

(i) all terms of one dimension; 

(ii) all terms of two dimensions except those of the type tl^tl^ with which we are not 
concerned; 

(iii) all terms of three or four dimensions in which the highest power exceeds 14 : the latter 
being the highest power which, say, can assume in the expansion of 

V i>i / 








Computation of universal moments of tests of statistical normality


Even in terms of three dimensions in which the highest power is less than 14, e.g. in 

the $\nu = -1$ test can be exploited. In fact, from (2·6) and (4·1) the required terms for
$\nu = -1$ are


j;(12S'-4) = 


14! 12!24! 7! 

10! 2”!® 3!®2! 14!2« 


21*== 12![11124], 


^( 12 - 82 ) 


14! 121812 7! 

6!22!3!2!»2!14!21* 


21*= 12![3116]. 


The sum of these two terms is 131[1236* 14] which should be the sum of the four E terms of 
dimension three in (3·1) since $E(20\cdot4^2)$ and $E(16\cdot8\cdot4)$ are zero (for $\nu = -1$). The checks
specified in this paragraph were fully applied to the terms of one, two and three dimensions
in (3·1) before multiplication by the $\nu$-factors (Table 1).

Reciprocal factorial vectors for dimensions exceeding two were checked fully by the
'double' factorization technique exemplified in Table 6. In view of the simplicity of the
two subsequent processes, namely those of the $\nu$- and constant multipliers, this check may
be taken as establishing the accuracy of the E's of dimension three or more. Reference may
nevertheless be made to a check at this stage, namely that the ratios of consecutive coefficients
in each E exhibit a marked regularity, if correct. Any irregularity (which in the nature
of the work will usually be large) must be suspect.

Assuming the accuracy of the E's in (3·1), the final stage was checked by multiplying by
the $\nu$-factors (3·6) in two ways:

(i) by straight multiplication using the primal notation;

(ii) by taking (in (3·1))

    $= (1+\nu)(1+2\nu)\ldots(1+6\nu)A_1 + \nu(1+\nu)\ldots(1+5\nu)A_2 + \ldots + \nu^6 A_7$
    $= (1+\nu)\ldots(1+5\nu)\{(1+6\nu)A_1 + \nu A_2\} + \ldots,$

and computing in successive stages

    $B_1 = (1+6\nu)A_1 + \nu A_2, \quad B_2 = (1+5\nu)B_1 + \nu^2 A_3,$ etc.

The results were the same.
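The agreement of the two routes can be illustrated with exact rational arithmetic. This is a minimal sketch: the coefficients A_1, ..., A_7 used below are arbitrary placeholders (the paper's own values are not reproduced here), and the function names are hypothetical.

```python
from fractions import Fraction


def direct(A, v):
    """Route (i): sum over k of v^(k-1) * (1+v)(1+2v)...(1+(m-k)v) * A_k."""
    m = len(A)
    total = Fraction(0)
    for k in range(1, m + 1):
        product = Fraction(1)
        for j in range(1, m - k + 1):
            product *= 1 + j * v
        total += v ** (k - 1) * product * A[k - 1]
    return total


def staged(A, v):
    """Route (ii): B_1 = (1+(m-1)v)A_1 + v*A_2, B_2 = (1+(m-2)v)B_1 + v^2*A_3, ..."""
    m = len(A)
    B = Fraction(A[0])
    for k in range(2, m + 1):
        B = (1 + (m - k + 1) * v) * B + v ** (k - 1) * A[k - 1]
    return B


if __name__ == "__main__":
    A = [3, -5, 7, 11, -13, 17, 19]   # placeholder coefficients, not from the paper
    v = Fraction(-1, 4)               # any rational value of nu will do
    print(direct(A, v) == staged(A, v))
```

The staged form is a nested (Horner-like) evaluation, so each stage can be compared with the corresponding partial sum of route (i) if a discrepancy has to be located.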

A satisfactory check for the final stage is that of $n = 4$. A. T. McKay (1933) has, in fact,
given a formula for this value of n from which the seventh moment from zero of $b_2$ is found
to be

82,220,810,261/6211 • 13- 17 • 23 • 29, 

which value also transpired on substituting 4 for n in (3·9). This establishes the accuracy
of all the formula except possibly the part accruing from the terms in (3·1) in

    $n(n-1)\ldots(n-4)$, and $n(n-1)\ldots(n-6)$

which vanish when $n = 4$.

If it adds nothing to the check in the previous paragraph it is nevertheless of interest to
observe that, for $n = 3$, the value of the seventh moment of $b_2$ is found to be $(\tfrac32)^7$ which is as
it should be since, in this case, each $b_2$ assumes the constant value $\tfrac32$, whether the samples
are normal or not.
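The constancy of $b_2$ for samples of three is easily confirmed numerically. The sketch below takes $b_2 = m_4/m_2^2$ with moments about the sample mean and divisor n; the sample values are arbitrary.

```python
import random
from fractions import Fraction


def b2(sample):
    """Sample kurtosis ratio m_4 / m_2^2, moments taken about the sample mean."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    return m4 / m2 ** 2


if __name__ == "__main__":
    # Exact arithmetic on an arbitrary triple.
    print(b2([Fraction(0), Fraction(1), Fraction(5)]))
    # Floating-point triples drawn from a normal and from a skew population.
    rng = random.Random(0)
    print([round(b2([rng.gauss(0.0, 1.0) for _ in range(3)]), 6) for _ in range(3)])
    print([round(b2([rng.expovariate(1.0) for _ in range(3)]), 6) for _ in range(3)])
```

The reason is algebraic: three deviations summing to zero satisfy $\sum d^4 = \tfrac12(\sum d^2)^2$, whence $m_4/m_2^2 = \tfrac32$ for every sample.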

A partial check is also afforded at the final stage by the vanishing of all coefficients of
powers of v from to inclusive. 


5. Conclusion

Previous investigators in this field have all used the combinatorial technique, invented by
R. A. Fisher (1929) and applied in the first instance to the cumulants, which are linear
functions of the sample moments. The present writers have not had sufficient experience in
working the Fisher technique to decide which method is easier to apply. It is quite likely
that the Fisher method is shorter. A strong point of the present computational scheme is
that it lends itself to check at every stage; and the method may appeal to students who
prefer the algebraical or arithmetical to the geometrical approach. For their benefit, and
also in case it may later be found necessary (in connexion with the accurate determination
of the probability points of $b_2$ for samples of all sizes) to compute higher moments than the
seventh (it is almost certain the seventh will be required), we give as an appendix an
extended series of reciprocal factorial vectors. From these can be derived without difficulty
(i) corresponding E's, on multiplication by appropriate $\nu$- and constant multipliers, and
(ii) r.f.v.'s of higher powers.


APPENDIX

A selection of reciprocal factorial vectors required for the calculation of moments of $b_2$ for normal
samples, including all used for the calculation of the seventh moment, in primal notation

All primals are negative


Order 

4 

8 

48 

12 

84 

0 

[1] 

[13] 

[2] 

[124] 

[14] 

1 

[1] 

[12] 

[1] 

[Hi] 

[4] 

2 

[13] 

[14] 

7tl3] 

[26] 

31[26] 

3 . 


[124] 

[13] 

[136] 

7[116] 

4 


[1127] 

[26] 

[1128] 

127[n28] 

S 




[1148] 

17[1138] 

6 




[1126-10] 

[113-10] 



43 

16 

12>4 

88 

848 

0 

[3] 

[1127] 

[125] 

[26] 

[15] 

1 

3[3] 

[1126] 

[123] 

r24] 

[13] 

2 

13[5] 

[137] 

13[136] 

6[26] 

17[26] 

3 

3[4] 

[237] 

[36] 

31[136] 

63[126] 

4 

13[17] 

[113-10] 

47[H38] 

323[1139] 

2497[1138] 

6 

[17] 

[1259] 

29[1247] 

[1028] 

173[1137] 

6 

[39] 

[1126-11] 

167[11259] 

43[124-10] 

7[139] 

7 


[11225-11] 

[11249] 

[124-10] 

[1049] 

8 


[11236-16] 

[1126-13] 

[224-14] 

[114-13] 





Order 

41 

20 

16 4 

0 

[4] 

[1248] 

[1128] 

1 

[2] 

[1148] 

[1028] 

2 

19[14] 

[113-10] 

11[13-101 

3 

6[4] 

[124-8] 

47[1238] 

4 

49[17] 

[124-11] 

263[124-11] 

5 

6[ie] 

[136-11] 

89[126-11] 

6 

19[38] 

[1126-13] 

61[lll6-13] 

7 

[38] 

[11226-12] 

1277[11226-12] 

8 

9 

iO 

[4-12] 

[11236-10] 
[111238-10] 
<[1111248- 18] 

229[11234-16] 

[11136-16] 

[11237-18] 



8«4 

84’ 

4’ 

0 

[27] 

[10] 

[6] 

1 

5[27] 

5[16] 

6[6] 

2 

109[39] 

13[8] 

126[n] 

3 

111[137] 

143[126] 

36[18] 

4 

803[112<10] 

1399[1119] 

646[28] 

5 

1493[114-10] 

239[1029] 

23[8] 

6 

399[122-12] 

0943[114-11] 

646[3-10] 

7 

119[124-11] 

65[104-10] 

36[89] 

3 

1643[226-16] 

277[114-14] 

126[4-13] 

9 

3I[226- 16] 

23[U6-14] 

6[4-13] 

10 

11 

12 

[226-17] 

[116-16] 

[6-16] 


12-8 


[137] 

[37] 

31[139] 

7[227] 

299[123-10] 

73[116-10] 

3713[U26a2] 

83[U26-n] 

1181[1236'16] 

47[1237-16] 

[1237-17] 


24 

[1126-10] 
[11249] 
[126-11] 
[126-11] 
[224-14] 
[236-12] 
[1137-14] 
[11236-14] 
[11237-18] 
[111239-17] 
[1111248-19] 
[1112349-19] 
[1111234- 10-22] 


12-4* 

[126] 

[26] 

10l[138] 

19[136] 

1163[1149] 

63[239] 

629[1106-11] 

479[1126-10] 

773[1126-14] 

13[1120-14] 

[1127-16] 


20-4 

[1249] 

[1238] 

63[126-10] 

31[125-10] 

23[124-13] 

13[136-lli 

161[1130.13] 

733[H230-13] 

1163[11237-17] 

687[111238-ie] 

67[1101247-18] 

[1111049-18] 

[1111249-21] 


16-8 28 

0 [113-10] [11226-11] 

1 [1129] [11126-11] 

2 13[104-11] [1126-13] 

3 43[114-11] [1136-12] 

4 3823[224-14] [236-16] 

6 1 507[226-12] [238-16] 

6 4933[il36-14] [1237-17] 

7 28943[11236-14] [11337-16] 

8 4331[1237-18] [11248-19] 

9 79[11236-17] [111249-19] 

10 43[11237-191 [1111249-21] 

11 37[11348-19] [111234-10-20] 

12 [11348-22] [1111234-10-23] 

[1112236- 10- 23] 
[1112246-13-26] 


128 

[248] 
[237] 
23[249] 
47[259] 
289[124-12] 
593[136-10] 
8531[1137-12] 
193[113612] 
929[1236 16] 
113[1238-16] 
37[1248-17] 
[1249-17] 
[224-10-20] 


8!i 


8 * 4 “ 


[39] 

[28] 

[- 10 -] 

47[12-10] 

89[101-13] 

281[113.11] 

2833[I24-13] 

11[121-13] 

2213[224-17] 

2089[236-16] 

71[236-18] 

[236-18] 

[336-21] 


[28] 

[17] 

86[39j 

169[129] 

11113[104-12] 

6137[114-10] 

22703[124-12] 

9341[125-12] 

90641[226-16] 

2463[226-16] 

137[120-17] 

17[220-17] 

[226-20] 





Order 

24-4 

20-8 

20-42 

0 

[1125-11] 

[126-11] 

[124-10] 

1 

[1025-11] 

[26-11] 

[24-10] 

2 

139[1120-13] 

19[124-13] 

179[126-123 

3 

47[1126-12] 

331[136-12] 

87[126-11] 

4 

79[226-16] 

7001[236-16] 

161[26;14] 

6 

31[137-15] 

1489[236-16J 

1501[136-14] 

6 

1061[1237-17] 

26613[1237.17] 

1609[1036-16] 

7 

79[11236-16] 

197[11127-16] 

27487[11236-14] 

8 

73[11138-19] 

3021 1[1 1247 -19] 

13043[11137-18] 

9 

59[101239-19] 

168713[111249-191 

209509[111238-18] 

10 

6611[1111249-21] 

310841[1111249-21] 

2197[1101244-20] 

11 

8011[111234- 10-20] 

127[1111336-20] 

63658[1111249-19] 

12 

1697[1111224-10-23] 

4013[111134- 10-23] 

1873[U11248-22] 

13 

47[1111234-10-23] 

109[I11135- 10-23] 

101[U1124- 10-22] 

14 

[1111234-11-25] 

[111135-10-25] 

[111124-10-24] 


16-84 

16-43 

1224 

0 

[113-11] 

[112-10] 

[249] 

1 

[13-11] 

[12-10] 

7[249] 

2 

29[14-13] 

2U[113-123 

211[2S-11] 

3 

37[113-12] 

069[123-11] 

119[26-10] 

4 

62139[226-16] 

3671[123-14] 

3821[126-13] 

5 

9893[216-15] 

2699[116-14] 

18667[136-13] 

6 

299297[1136-17] 

29693[1116-16] 

490288[1137-163 

7 

2511043[11237-16] 

106361[U216-14] 

193[1133-13] 

8 

360131[11236-19] 

624611[11136-18] 

441773[1238-17] 

9 

173497[11237-19] 

884393[11236-18] 

24799[1238-17] 

10 

689[11208-21] 

14113[10236-20] 

2647[1148-19] 

11 

29437[11348-20] 

2771[11138-19] 

677[1249-18] 

12 

3307[11348-23] 

2679[11238-22] 

2707[224-10-21] 

13 

[10249-23] 

23[ll238-22] 

23[224- 10-21] 

14 

[11349-26] 

[11239-24] 

[224-11-23] 


13 •842 

12-44 

8»4 

0 

[139] 

[128] 

[3-10] 

1 

7[139] 

7[128] 

7[3-10] 

2 

227[14-11] 

47[3-103 

236[4-12] 

3 

267[23-10] 

36[39] 

281[13-11] 

4 

97623[126-13] 

319[110*12] 

4177[112-14] 

5 

246[6-13] 

69971[124-12] 

0431111 -U] 

6 

61219[1106-16] 

3151259[1125-14] 

98797[124- 16] 

7 

9479[1026-13] 

34637[1116-12] 

g07[114-14] 

S 

12738433[1237-17] 

12O2061[1126-10] 

37161[216-18] 

9 

46223[1136-17] 

116769[1126-16] 

228603[236-18] 

10 

63693[238-19] 

9841[1125-18] 

60333[236-20] 

11 

1937[1228-18] 

1823[1127-17] 

391[137-19] 

12 

1671[1238-21] 

167[1028-20] 

1103[336-22] 

13 

63[1239-21] 

[1117-20] 

[325-22] 

14 

[1239-23] 

[1129-22] 

[337-24] 




16 12 

[124-11] 
[24-11] 
187[126-13] 
379[13S-12] 
2819[234-16] 
17161[237-16] 
104507[1237-17] 
30317[11237-16] 
431099[11248-19] 
17177[11248-19] 
6613[11249-21] 
1019[1134-10-20] 
991[1234- 10-23]- 
31[1235- 10-23] 
[1236-11-25] 


12 - 8 “ 

[14-10] 

7[l4-10] 

73[14-13] 

667[26-ll] 

'22697[126-14] 

4679[116-14] 

3267831[1137-16] 

22177[1127-14] 

374281[1236-18] 

204907[1238-18] 

4783[1138-20] 

2251[1248-19] 

1277[1339-22] 

61[1349-22] 

[1349-24] 


g!43 

[29] 

7[29] 

9[-ll] 

947[13-10] 

1811[110-13] 

27737[113-13] 

1783141[126-16] 

20627[116-13] 

1772417[226-17] 

547889[226-17] 

161331[226-19] 

127[223-18] 

2329[227-21] 

37[227-21] 

[227-23] 





Order 

84^ 

4’’ 

0 

[18] 

in 

1 

7[18] 

nn 

2 

261[2-10] 

269[19] 

3 

1061[12'9] 

77[8] 

4 

182843[113'12] 

2107[M1] 

6 

24391[103‘12j 

1603[Mi] 

6 

127741[104'14] 

29771[3‘13] 

7 

13l3[4'12j 

2609[3*11] 

8 

423697[116*16] 

29771[4-15] 

9 

64827[116'16] 

10O3[3-16] 

10 

11466[106'18] 

2107[4-17] 

11 

47[6'17] 

77[4-16] 

12 

96[106<20] 

269[6-19] 

13 

29[117-20] 

7[0-19] 

14 

[117-22] 

[7-21] 


REFERENCES

Craig, C. C. (1928). Metron, 7, 3.
Fisher, R. A. (1929). Proc. Lond. Math. Soc. (2), 30, 199.
Fisher, R. A. (1930). Proc. Roy. Soc. A, 130, 16.
Fisher, R. A. & Wishart, J. (1931). Proc. Lond. Math. Soc. (2), 33, 196.
Geary, R. C. (1933). Biometrika, 25, 184.
Hsu, C. T. & Lawley, D. N. (1940). Biometrika, 31, 238.
McKay, A. T. (1933). Biometrika, 25, 411.
Pepper, J. (1932). Biometrika, 24, 66.
Wishart, J. (1930). Biometrika, 22, 224.





THE ASYMPTOTICAL DISTRIBUTION OF RANGE 
IN SAMPLES FROM A NORMAL POPULATION 

By G. ELFVING, Helsingfors 

1. Introductory. Consider a sample of n observations, taken from an infinite normal 
population with the mean 0 and the standard deviation 1. Let a be the smallest and b the 
greatest of the observed values. Then w = b - a is the range of the sample. 

For certain statistical purposes knowledge of the sampling distribution of range is needed.
The distribution function, however, involves a rather complicated integral, whose exact
calculation is, for n > 2, impossible. Tippett (1926), E. S. Pearson (1926, 1932) and McKay
& Pearson (1933) have studied and calculated the mean, the standard deviation and the
Pearson constants of the range. Fitting appropriate Pearson curves to the distribution
by means of these parameters, Pearson (1932) has computed approximate percentage points
for it. Later on, Hartley (1942) and Hartley & Pearson (1942) have, by numerical integration,
tabulated the distribution function for n = 2, ..., 20.

As pointed out by Pearson, the distribution of range is very sensitive to departures from
normality in the tails of the parental distribution. The effect of such departures becoming
more perceptible for increasing n, the practical importance of the range distribution is,
perhaps, small for large samples. Nevertheless, it seems to be at least of theoretical interest
to investigate the asymptotical distribution of range for $n \to \infty$. This is the purpose of the
present paper.* The results are summarized in a theorem at the end of the inquiry.

2. The exact distribution. Transformations. The joint-frequency function of the extremes
a, b reads, as well known,

    $f_n(a,b) = n(n-1)\,\phi(a)\,\phi(b)\,[\Phi(b)-\Phi(a)]^{n-2}$ †    (2·1)

(cf. e.g. Cramér, 1946, p. 370). Let $u = \tfrac12(a+b)$ denote the arithmetical mean of the extreme
values of the sample. Making in (2·1) the transformation $a = u - \tfrac12 w$, $b = u + \tfrac12 w$ and
integrating with respect to u, we find for the frequency function of the range the expression

    $f_n(w) = n(n-1)\int_{-\infty}^{\infty}\phi(u-\tfrac12 w)\,\phi(u+\tfrac12 w)\,[\Phi(u+\tfrac12 w)-\Phi(u-\tfrac12 w)]^{n-2}\,du.$    (2·2)

The object of our inquiry is the limiting form of the distribution (2·2). It proves, however,
more advantageous to pass to the limit in the joint distribution of a, b or u, w, before
integrating with respect to u.
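The frequency function of the range, written as the integral over u of the joint density of the extremes, lends itself to direct numerical evaluation. The following sketch uses the trapezoidal rule; the truncation limits and step counts are assumptions chosen for illustration, and the function names are not from the paper.

```python
from math import erf, exp, pi, sqrt


def phi(t):
    """Standard normal frequency function."""
    return exp(-0.5 * t * t) / sqrt(2.0 * pi)


def Phi(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))


def range_density(w, n, half_width=8.0, steps=400):
    """Density of the range w for sample size n: trapezoidal rule in u on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        u = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        a, b = u - 0.5 * w, u + 0.5 * w
        total += weight * phi(a) * phi(b) * (Phi(b) - Phi(a)) ** (n - 2)
    return n * (n - 1) * total * h


def mean_range(n, w_max=12.0, steps=300):
    """Mean of the range, integrating w * density over [0, w_max]."""
    h = w_max / steps
    total = 0.0
    for i in range(steps + 1):
        w = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * w * range_density(w, n)
    return total * h


if __name__ == "__main__":
    print(mean_range(2), 2.0 / sqrt(pi))  # for n = 2 the mean range is 2/sqrt(pi)
```

For n = 2 the range is the absolute difference of two independent normal deviates, which gives the closed value $2/\sqrt{\pi}$ used as a check above.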

The asymptotical distribution of a and b has been investigated by Fisher & Tippett (1928),
and Gumbel (1936) (cf. also Cramér, 1946, p. 376). According to these authors, we have

    $E(u) = 0, \qquad D(u) = O(\log^{-1/2} n),$
    $E(w) = 2\sqrt{2\log n} + \ldots, \qquad D(w) = O(\log^{-1/2} n).$

From the formulae quoted it is seen that $u \to 0$, $w \to \infty$ in probability as $n \to \infty$. Our first
task must, consequently, be a transformation of the variables a, b (or u, w) depending on
n and intended to stabilize the probability mass, in order to provide a limiting distribution.

* Prof. H. Wold has kindly directed my attention to this problem.

† $\Phi(x)$ denotes the distribution function and $\phi(x) = \Phi'(x)$ the frequency function of the normal
distribution with mean at x = 0 and unit standard deviation.






Following the example of the authors mentioned above, we should have to introduce the
new variables

    $a' = n\Phi(a), \qquad b' = n\Phi(-b).$    (2·3)

For our purpose it proves, however, advantageous to subject a' and b' to a new transformation,
independent of n, taking

    $x e^{y} = 2n\Phi(a) = 2n\Phi(-\tfrac12 w + u),$
    $x e^{-y} = 2n\Phi(-b) = 2n\Phi(-\tfrac12 w - u).$    (2·4)


Conversely,

    $x = 2n\sqrt{\Phi(a)\,\Phi(-b)}, \qquad y = \tfrac12\log\frac{\Phi(a)}{\Phi(-b)}.$    (2·5)

As $a \le b$ and thus $\Phi(a) + \Phi(-b) \le 1$, it follows from (2·4) that x, y are subjected to the
restrictions

    $x \ge 0, \qquad x\cosh y \le n.$    (2·6)

Performing the transformation, we find

    $\frac{\partial(a,b)}{\partial(x,y)} = \frac{x}{2n^2\,\phi(a)\,\phi(b)},$    (2·7)

and thus, letting $f_n(x,y)$ denote the joint-frequency function of x, y,

    $f_n(x,y) = \tfrac12\,x\left(1-\frac1n\right)\left(1-\frac{x\cosh y}{n}\right)^{n-2}.$    (2·8)

This formula is valid in the region (2·6); outside of it, we have to put $f_n(x,y) = 0$.

The new variables x, y depend, of course, on u as well as w. It will, however, be shown later,
that x, for large n, tends to coincide with the variable

    $x^* = 2n\Phi(-\tfrac12 w),$

which depends exclusively on w. For testing purposes, the former variable may thus, in
large samples, be used as a substitute for the range. These considerations justify the
transformation (2·4) as well as a closer study of the distribution of x and its limiting form.

3. Limit passage and remainder term. The limiting form of the joint-frequency function
(2·8) is immediately seen to be

    $f(x,y) = \tfrac12\,x\,e^{-x\cosh y} \qquad (x \ge 0).$    (3·1)

The integral of this function, taken over the whole half-plane $x \ge 0$, is easily seen to equal 1;
(3·1) is, consequently, the frequency function of a well-determined two-dimensional
distribution.

Let the marginal distribution functions in x, corresponding to (2·8) and (3·1), be denoted
by $F_n(x)$ and $F(x)$ respectively. Our next task will be to estimate the remainder
$|F_n(x) - F(x)|$, which is, obviously, at most equal to the integral

    $I = \int_0^x\!\int_0^\infty 2\,|f_n(\xi,\eta) - f(\xi,\eta)|\,d\xi\,d\eta.$    (3·2)

To begin with, we estimate the quotient $f_n/f$ upwards. By differentiation with respect
to the variable $z = x\cosh y$, this quotient is found to attain the maximum value

    $\left(1-\frac1n\right)\left(1-\frac2n\right)^{n-2}e^{2}$

for $z = 2$. We thus find, for example,


    $\frac{f_n}{f} < 1 + \frac{1.3}{n} \qquad (n \ge 6).$    (3·3)
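The location and size of the maximum of $f_n/f$ can be examined numerically. The sketch below assumes the quotient in the form $(1-1/n)(1-z/n)^{n-2}e^{z}$ with $z = x\cosh y$, as follows from dividing the finite-n joint density by its limiting form $\tfrac12 x e^{-x\cosh y}$; the grid step is an arbitrary choice.

```python
from math import exp


def ratio(z, n):
    """Quotient f_n / f as a function of z = x cosh y, for 0 <= z <= n."""
    return (1.0 - 1.0 / n) * (1.0 - z / n) ** (n - 2) * exp(z)


def ratio_max(n):
    """Value of the quotient at the stationary point z = 2."""
    return (1.0 - 1.0 / n) * (1.0 - 2.0 / n) ** (n - 2) * exp(2.0)


def argmax_z(n, step=0.001):
    """Grid search for the maximising z on [0, n]."""
    count = int(round(n / step))
    grid = [i * step for i in range(count + 1)]
    return max(grid, key=lambda z: ratio(z, n))


if __name__ == "__main__":
    for n in (6, 12, 50, 200):
        # n * (max - 1) settles towards 1, i.e. the excess over unity is of order 1/n.
        print(n, argmax_z(n), ratio_max(n), n * (ratio_max(n) - 1.0))
```

The printed values of $n(\max - 1)$ show that the upper estimate of the quotient exceeds unity only by a term of order 1/n.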



For the further estimations, it proves necessary to divide the domain of integration in (3·2)
into an interior and an exterior part by means of a convenient abscissa $\eta = \bar y$. In order to
secure the Maclaurin expansion of $\log\left(1 - \frac{\xi\cosh\eta}{n}\right)$ within the interior region, we have to
choose $\bar y$ so as to satisfy the inequality $\frac{x\cosh\bar y}{n} \le h$ with an appropriate $h < 1$. Taking, for
simplicity, $h = 1 - \sqrt{\tfrac12}$, and observing that $\cosh\bar y \le e^{\bar y}$, we see that the condition mentioned
is fulfilled if

    $x e^{\bar y} \le n\left(1 - \sqrt{\tfrac12}\right).$    (3·4)

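Now we may estimate $f_n/f$ downwards in the interior domain of integration. Expanding
$\log\left(1 - \frac{\xi}{n}\cosh\eta\right)$, we find

    $\log\frac{f_n}{f} = \log\left(1-\frac1n\right) + \frac{2}{n}\,\xi\cosh\eta - \frac{n-2}{2n^2}\,\frac{\xi^2\cosh^2\eta}{\left(1-\theta\,\frac{\xi}{n}\cosh\eta\right)^2} \qquad (0<\theta<1).$    (3·5)

According to the determination of $\bar y$, the remainder factor $\left(1-\theta\frac{\xi}{n}\cosh\eta\right)^{-2}$ is seen to be
< 2 for $\xi \le x$, $\eta \le \bar y$. For $n \ge 3$, we have $\log\left(1-\frac1n\right) > -\frac{3}{2n}$. Omitting, further, the positive
term in (3·5) and replacing $n - 2$ by n, we find

    $\frac{f_n(\xi,\eta)}{f(\xi,\eta)} > 1 - \frac{\tfrac32 + \xi^2\cosh^2\eta}{n};$

hence, combining with (3·3),

    $\left|\frac{f_n(\xi,\eta)}{f(\xi,\eta)} - 1\right| < \frac{\tfrac32 + \xi^2\cosh^2\eta}{n}.$    (3·6)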


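In the exterior domain of integration, (3·3) directly yields

    $|f_n(\xi,\eta) - f(\xi,\eta)| < f(\xi,\eta).$    (3·7)

We proceed to the estimation of the integral (3·2), denoting its interior and exterior part
by $I_1$ and $I_2$ respectively. For the former we have, according to (3·6), the inequality

    $I_1 = \int_0^x\!\int_0^{\bar y}\left|\frac{f_n}{f} - 1\right| 2f\,d\xi\,d\eta < \frac1n\int_0^x\!\int_0^{\bar y}\left(\tfrac32\,\xi + \xi^3\cosh^2\eta\right)e^{-\xi\cosh\eta}\,d\xi\,d\eta;$    (3·8)

for the latter, according to (3·7),

    $I_2 < \int_0^x\!\int_{\bar y}^{\infty} 2f\,d\xi\,d\eta = \int_0^x\!\int_{\bar y}^{\infty}\xi\,e^{-\xi\cosh\eta}\,d\xi\,d\eta.$    (3·9)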


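The integration with respect to $\xi$ may be explicitly performed. We have, in fact, putting
for brevity $\cosh\eta = \alpha$,

    $\int_0^x \xi\,e^{-\alpha\xi}\,d\xi = \frac{1}{\alpha^2}\left\{1 - e^{-\alpha x}[1+\alpha x]\right\},$    (3·10)

    $\int_0^x \xi^3 e^{-\alpha\xi}\,d\xi = \frac{6}{\alpha^4}\left\{1 - e^{-\alpha x}\left[1 + \alpha x + \frac{(\alpha x)^2}{2} + \frac{(\alpha x)^3}{6}\right]\right\}.$    (3·11)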


In order to deduce remainder formulas for (a) moderate, (b) small x, we omit in (3·10) and
(3·11), (a) all the negative terms, (b) the terms with $x^2$ and $x^3$. According to the Maclaurin
expansion of the exponential function, the expression in curled brackets in (3·10) is at most
equal to $\tfrac12(\alpha x)^2$. Inserting these estimations in (3·8), we obtain for the interior integral
the inequalities

    $I_1 < \frac{15}{2n}\int_0^{\bar y}\frac{d\eta}{\cosh^2\eta} = \frac{15}{2n}\tanh\bar y < \frac{15}{2n},$    (3·12a)

    $I_1 < \frac{15x^2}{4n}\int_0^{\bar y}d\eta < \frac{4x^2\bar y}{n}.$    (3·12b)

For the exterior integral, (3·10) yields

    $I_2 < \int_{\bar y}^{\infty}\frac{d\eta}{\cosh^2\eta} = 1 - \tanh\bar y < 2e^{-2\bar y}.$    (3·13)


Finally, we have to join the results (3·12) and (3·13). Combining, first, (3·12a) with (3·13)
and determining $\bar y$ from (3·4) (taken with the equality sign), we obtain, after some slight
simplifications in the numerical coefficients,

\ (nS6). (3- 14a) ' 

n\ n ] 

Combining, on the other hand, (3·12b) with (3·13), we find

    $I < \frac{4x^2\bar y}{n} + 2e^{-2\bar y}.$

This expression attains, for fixed x and n, its minimum when $\bar y = \log\frac{\sqrt n}{x}$. For $n \ge 12$, this
value of $\bar y$ also satisfies (3·4), and we obtain, as a parallel estimate to (3·14a),

    $I < \frac{4x^2}{n}\left(\log\frac{\sqrt n}{x} + \frac12\right) \qquad (n \ge 12).$    (3·14b)

The formulas (3·14a, b) are both valid for all positive x and all $n \ge 12$.
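The minimisation leading to the estimate for small x can be retraced numerically. The sketch assumes the combined bound to have the form $4x^2\bar y/n + 2e^{-2\bar y}$, a form inferred here from the final estimate and not legible in the source; the names are illustrative.

```python
from math import exp, log, sqrt


def combined_bound(y_bar, x, n):
    """Assumed bound: interior part 4 x^2 y_bar / n plus exterior part 2 exp(-2 y_bar)."""
    return 4.0 * x * x * y_bar / n + 2.0 * exp(-2.0 * y_bar)


def minimiser(x, n):
    """Stationary point of the bound: y_bar = log(sqrt(n) / x)."""
    return log(sqrt(n) / x)


def closed_form(x, n):
    """Value of the bound at the minimiser: (4 x^2 / n) (log(sqrt(n)/x) + 1/2)."""
    return 4.0 * x * x / n * (log(sqrt(n) / x) + 0.5)


def satisfies_interior_condition(x, n):
    """Condition x * exp(y_bar) <= n (1 - sqrt(1/2)) at the minimiser; reduces to 1/sqrt(n) <= 1 - sqrt(1/2)."""
    return x * exp(minimiser(x, n)) <= n * (1.0 - sqrt(0.5))


if __name__ == "__main__":
    x, n = 1.0, 20
    y0 = minimiser(x, n)
    print(combined_bound(y0, x, n), closed_form(x, n))
    print([satisfies_interior_condition(x, m) for m in (11, 12, 13)])
```

The last line shows why the restriction begins at n = 12: the condition on $\bar y$ reduces to $1/\sqrt n \le 1 - \sqrt{\tfrac12}$, which fails for n = 11.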


4. The asymptotical distribution. Having established the limiting distribution of the
variable x defined in (2·5), we are going to examine its properties.

The frequency function of the distribution considered reads, according to (3·1),

    $f(x) = x\int_0^\infty e^{-x\cosh y}\,dy = x\int_1^\infty\frac{e^{-xt}}{\sqrt{t^2-1}}\,dt.$    (4·1)

Changing the order of integration, we easily find the distribution function, the mean and
the variance of (4·1) to be

    $F(x) = 1 - \int_0^\infty\frac{1 + x\cosh y}{\cosh^2 y}\,e^{-x\cosh y}\,dy = 1 - \int_1^\infty\frac{1+xt}{t^2\sqrt{t^2-1}}\,e^{-xt}\,dt,$    (4·1')

    $E(x) = \tfrac12\pi, \qquad D^2(x) = 4 - \tfrac14\pi^2.$    (4·2)
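The mean and variance may be verified by carrying out the change of order of integration explicitly. Integrating the limiting joint density $\tfrac12 x e^{-x\cosh y}$ first over x gives $E(x^k) = (k+1)!\int_0^\infty \cosh^{-(k+2)} y\,dy$; the sketch below evaluates this by the trapezoidal rule, with truncation point and step count chosen arbitrarily.

```python
from math import cosh, factorial, pi


def sech_power_integral(p, upper=40.0, steps=40000):
    """Trapezoidal estimate of the integral of cosh(y)^(-p) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight / cosh(i * h) ** p
    return total * h


def moment(k):
    """E(x^k) of the limiting distribution, after integrating over x first."""
    return factorial(k + 1) * sech_power_integral(k + 2)


if __name__ == "__main__":
    mean = moment(1)
    variance = moment(2) - mean ** 2
    print(moment(0), mean, pi / 2.0)
    print(variance, 4.0 - pi ** 2 / 4.0)
```

The zeroth moment returns unity, which confirms at the same time that the limiting density integrates to 1.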

The numerical evaluation of the distribution is much simplified by the fact that $f(x)$ as
well as $F(x)$ is closely connected with certain Bessel functions. Denote

    $\psi(x) = \int_0^\infty e^{-x\cosh y}\,dy = \int_1^\infty\frac{e^{-xt}}{\sqrt{t^2-1}}\,dt.$    (4·3)

By differentiation and partial integration, this function is found to satisfy the differential
equation

    $\psi''(x) + \frac1x\,\psi'(x) - \psi(x) = 0.$    (4·4)



Changing x into $-ix$, we obtain for the function $\chi(x) = \psi(-ix)$ the equation

    $\chi''(x) + \frac1x\,\chi'(x) + \chi(x) = 0;$    (4·4')

hence, $\chi(x)$ is a Bessel function of order zero.

In order to specify this function, we will deduce an asymptotical expression for the
function (4·3), valid for large x. For this purpose, we make in the latter integral (4·3) the
substitution $t = 1 + u/x$ and write

+ = (0<^<l). 

Performing the integration, we obtain

    $\psi(x) = \sqrt{\frac{\pi}{2x}}\;e^{-x}\left[1 + O\left(\frac1x\right)\right],$    (4·5)

which shows that the Bessel function $\chi(x) = \psi(-ix)$ tends to zero for $x \to +i\infty$. This function
is, consequently, proportional to the Hankel function $H_0^{(1)}(x)$ (cf. Jahnke-Emde, 1909, p. 94).

Comparing the asymptotical expressions of $\psi(x)$ and $iH_0^{(1)}(ix)$, we find the proportional
factor to be $\tfrac12\pi$, whence

    $f(x) = x\left[\tfrac12\pi i\,H_0^{(1)}(ix)\right].$    (4·6)

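We proceed to the calculation of $F(x)$. Every integral of $xH_0^{(1)}(x)$ is (cf. Jahnke-Emde,
p. 166) of the form $xH_1^{(1)}(x) + \text{Const.}$, where $H_1^{(1)}(x)$ is the first order Hankel function
corresponding to $H_0^{(1)}(x)$; consequently,

    $F(x) = C - x\left[-\tfrac12\pi H_1^{(1)}(ix)\right].$

Now $-\tfrac12\pi H_1^{(1)}(ix)$ tends to zero as $\left(\tfrac12\pi/x\right)^{1/2}e^{-x}$ for $x \to \infty$ (cf. Jahnke-Emde, 1909, p. 101);
hence $C = 1$ and

    $F(x) = 1 - x\left[-\tfrac12\pi H_1^{(1)}(ix)\right].$    (4·7)

For small x, $F(x)$ has the expansion

    $F(x) = \left(\log\frac{2}{\gamma x} + \frac12\right)\frac{x^2}{2} + \left(\log\frac{2}{\gamma x} + \frac54\right)\frac{x^4}{16} + \ldots,$    (4·8)

where

    $\log\frac{2}{\gamma} = 0.11593\ldots.$    (4·9)

The factors of x in (4·6) and (4·7) are tabulated in Jahnke-Emde (1909, pp. 135-6).
Below, we give a short table of $f(x)$ and $F(x)$. The corresponding curves are seen in Fig. 1.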
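The tabulated values can be recomputed without Bessel tables from the integral $\int_0^\infty e^{-x\cosh t}\cosh(\nu t)\,dt$, which for $\nu = 0$ is the function of (4·3) and for $\nu = 1$ gives the corresponding first order function (the standard integral representation of the modified Bessel function of the second kind). The truncation point and step count below are assumptions, and the function names are not from the paper.

```python
from math import cosh, exp


def bessel_integral(order, x, upper=20.0, steps=20000):
    """Trapezoidal estimate of the integral of exp(-x cosh t) cosh(order t) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * exp(-x * cosh(t)) * cosh(order * t)
    return total * h


def limiting_density(x):
    """f(x) = x times the order-zero integral."""
    return x * bessel_integral(0, x)


def limiting_distribution(x):
    """F(x) = 1 - x times the order-one integral."""
    return 1.0 - x * bessel_integral(1, x)


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 3.0, 5.0):
        print(x, round(limiting_density(x), 4), round(limiting_distribution(x), 4))
```

The entries for x = 1 and x = 3 in the short table (0.4210, 0.3981 and 0.1042, 0.8795) serve as convenient spot checks.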


5. Connexion between the variable x and the range. We now turn back to the original
object of our inquiry: the asymptotical distribution of the range.

Consider the variable

    $x = 2n\sqrt{\Phi(-\tfrac12 w + u)\,\Phi(-\tfrac12 w - u)}$    (5·1)

introduced in (2·4). As mentioned earlier,

    $w \to \infty, \quad u \to 0$ in probability $(n \to \infty)$.    (5·2)

Under such circumstances, for large n, x may be expected to behave substantially as the
variable

    $x^* = 2n\Phi(-\tfrac12 w),$    (5·3)

which depends exclusively on the range.


We shall now prove that $x^*/x \to 1$ in probability as $n \to \infty$. According to the well-known
asymptotic formula

we may, for $|u| < \tfrac12 w$, write


■y% 

X 


-luin). 


   x      f(x)      F(x)          x      f(x)      F(x)

  0.0    0.0000    0.0000        1.5    0.3207    0.5839
  0.1    0.2427    0.0146        2.0    0.2278    0.7202
  0.2    0.3505    0.0448        2.5    0.1559    0.8153
  0.3    0.4118    0.0832        3.0    0.1042    0.8795
  0.4    0.4458    0.1262        4.0    0.0446    0.9501
  0.5    0.4622    0.1718        5.0    0.0185    0.9798
  0.6    0.4665    0.2183        6.0    0.0075    0.9919
  0.7    0.4624    0.2648        7.0    0.0030    0.9968
  0.8    0.4522    0.3106        8.0    0.0012    0.9988
  0.9    0.4380    0.3552        9.0    0.0005    0.9995
  1.0    0.4210    0.3981       10.0    0.0002    0.9998



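Given an arbitrary $\epsilon > 0$, we obviously may find two positive numbers $u_\epsilon$ and $w_\epsilon$ ($> u_\epsilon$)
such that

    $\left|\frac{x^*}{x} - 1\right| < \epsilon \quad \text{if} \quad |u| < u_\epsilon,\ w > w_\epsilon.$    (5·4)

On account of (5·2), we may, on the other hand, choose $n_\epsilon$ so that the probability of the
simultaneous validity of the latter inequalities in (5·4) exceeds $1 - \epsilon$ if $n > n_\epsilon$. Consequently,

    $P\left(\left|\frac{x^*}{x} - 1\right| < \epsilon\right) > 1 - \epsilon \qquad (n > n_\epsilon),$    (5·5)

which proves our statement.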




As shown in section 3, the distribution function $F_n(x)$ of x converges to $F(x)$ as $n \to \infty$.
Since $F(0) = 0$, it follows from (5·5), by a well-known method of argument, that the
distribution function $F_n^*(x)$ of x* converges to the same limiting function. The asymptotical
distribution of the range, suitably transformed, is hereby established.

For practical purposes, it would, of course, be desirable to possess a reasonably accurate
estimate of the remainder $F_n^*(x) - F(x)$, or at least an estimate of the difference $F_n^*(x) - F_n(x)$,
to be combined with the results (3·14).

For n = 20, the accuracy of $F(x)$ as substitute for $F_n^*(x)$ may be checked by means of
Hartley's (1942) tables. The discrepancy amounts to about 0.004 for x = 0.1, 0.026 for x = 1,
and 0.010 for x = 4.

The theoretical evaluation of $F_n^*(x) - F(x)$ seems to be somewhat complicated and, besides,
of little use since x*, for most purposes, may be replaced by x. A few remarks concerning
the relations between x, x* and their distribution functions will, however, be added below.

To begin with, we note that always $x \le x^*$, the equality sign being valid only if u = 0.
Consider, in fact, the function $x(u)$, defined by (5·1) for a fixed w. Inserting for $\Phi$ its analytical
expression, we easily find that $D_u^2\log x(u) \le 0$ for all u. Hence, $x(u)$ has no minimum and
at most one maximum, and the latter is, by symmetry, seen to be attained for u = 0, being
thus equal to x*.
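The inequality between the two variables can be observed on simulated samples. The sketch below computes both from the extremes of each sample, taking $x = 2n\sqrt{\Phi(a)\Phi(-b)}$ and $x^* = 2n\Phi(-\tfrac12 w)$; the sample size, seed and number of repetitions are arbitrary choices.

```python
import random
from math import erfc, sqrt


def Phi(t):
    """Standard normal distribution function (erfc keeps accuracy in the lower tail)."""
    return 0.5 * erfc(-t / sqrt(2.0))


def x_and_x_star(sample):
    """Return (x, x*) computed from the smallest and greatest observations."""
    n = len(sample)
    a, b = min(sample), max(sample)
    w = b - a
    x = 2.0 * n * sqrt(Phi(a) * Phi(-b))
    x_star = 2.0 * n * Phi(-0.5 * w)
    return x, x_star


def simulate(n=20, repetitions=2000, seed=1):
    """Pairs (x, x*) for repeated normal samples of size n."""
    rng = random.Random(seed)
    return [x_and_x_star([rng.gauss(0.0, 1.0) for _ in range(n)])
            for _ in range(repetitions)]


if __name__ == "__main__":
    pairs = simulate()
    print(all(x <= x_star * (1.0 + 1e-12) for x, x_star in pairs))
    print(max(x_star / x for x, x_star in pairs))
```

The second printed figure gives an impression of how far the ratio x*/x departs from unity at the chosen sample size.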

From $x \le x^*$, it follows that $F_n^*(x) \le F_n(x)$ for all x. We will show that the difference
$F_n(x) - F_n^*(x)$ may be expressed as a double integral.

The variables u and w are, according to (2·4), well-determined functions of x and y in the
region (2·6); and so is the variable x*, on account of (5·3).

On the level curve $x^* = x_0$, w has a constant value $w_0$, determined by

    $2n\Phi(-\tfrac12 w_0) = x_0,$

and this curve is, consequently, given in parametric form by the equations

    $x = 2n\sqrt{\Phi(-\tfrac12 w_0 + u)\,\Phi(-\tfrac12 w_0 - u)}, \qquad y = \tfrac12\log\frac{\Phi(-\tfrac12 w_0 + u)}{\Phi(-\tfrac12 w_0 - u)},$    (5·7)

where u runs through all values from $-\infty$ to $+\infty$. The latter function (5·7) being, obviously,
monotonously increasing, we may imagine u eliminated, writing (5·7) in the form

    $x = L_n(x_0, y) \qquad (-\infty < y < \infty).$    (5·7')

From the proof of the inequality $x \le x^*$ given above, it follows that the function (5·7') has
a single maximum for y = 0. When $y \to \pm\infty$, the function obviously tends to zero.

The inequality $x^* \le x_0$ is fulfilled on the left side of the curve (5·7'), the inequality $x \le x_0$
on the left side of the straight line $x = x_0$. Let us for brevity denote the regions (cf. fig. 2)

    $0 \le x \le L_n(x_0, y), \qquad L_n(x_0, y) < x \le x_0$    (5·8)

by $A_n(x_0)$ and $B_n(x_0)$ respectively. The difference $F_n(x_0) - F_n^*(x_0)$ is, then, the probability
of the points x, y falling within the region $B_n(x_0)$. Dropping the indices 0, we thus obtain
the expression sought for

    $F_n(x) - F_n^*(x) = \iint_{B_n(x)} f_n(\xi,\eta)\,d\xi\,d\eta.$    (5·9)

Comparing, finally, the transformed range distribution function $F_n^*(x)$ directly with its
limiting form $F(x)$, we find

    $F_n^*(x) - F(x) = [F_n(x) - F(x)] - [F_n(x) - F_n^*(x)]$
    $\qquad = \iint_{\xi \le x}(f_n - f)\,d\xi\,d\eta - \iint_{B_n(x)} f_n\,d\xi\,d\eta$
    $\qquad = \iint_{A_n(x)}(f_n - f)\,d\xi\,d\eta - \iint_{B_n(x)} f\,d\xi\,d\eta.$    (5·10)

The former integral is, obviously, at most equal to the remainder expression in (3·2),
estimated in (3·14).


6. Conclusion. Our main results may be summarized in the following theorem:

Theorem. Consider a sample of n observations from an infinite normal population with
mean 0 and standard deviation 1. Let a be the smallest, b the greatest of the observed values,
and put

    $x = 2n\sqrt{\Phi(a)\,\Phi(-b)}, \qquad x^* = 2n\Phi\left(-\tfrac12(b-a)\right),$

the latter variable being evidently a simple transformation of the range of the sample. Then

(1) $x \le x^*$; $x^*/x \to 1$ in probability $(n \to \infty)$.

(2) The distribution functions $F_n(x)$ and $F_n^*(x)$ of x and x* tend, for $n \to \infty$, to the
common limit

    $F(x) = 1 - x\left[-\tfrac12\pi H_1^{(1)}(ix)\right],$

where HP{z) is the first order Bessel function, which vanishes as — e®'* for z -5-+ i 00. 

\2i/ 

(3) For n ^ 12, F^x) satisfies the inequalities 


\FAx)~F(x) 


n\ n ) 


\FAx)-F{x)\ 


4*^ 




7. Generalization. A great part of our conclusions does not presuppose the normality
of the parental population. Thus, the distribution (2·8) of the variables x, y defined by (2·5)
is the same for any continuous probability law and so, consequently, is its limiting form;
however, if the parental distribution is non-symmetrical, with distribution function $G(x)$,
say, the factor $\Phi(-b)$ in (2·5) must, of course, be replaced by $1 - G(b)$ instead of $G(-b)$,
and the variable x* is to be defined by

    $x^* = 2n\sqrt{G(-\tfrac12 w)\,[1 - G(\tfrac12 w)]}.$

The proof of the statement $x^*/x \to 1$ requires, however, convenient assumptions concerning
the parental distribution. It can be proved that the assertion mentioned, and consequently
the theorem stated above, are valid if the frequency function of this distribution is of the form

9(x) = (7exp|^-l|a:|»J, 

where $1 < p \le 2$.



REFERENCES

Cramér, H. (1946). Mathematical Methods of Statistics. Uppsala.

Fisher, R. A. & Tippett, L. H. C. (1928). Limiting forms of the frequency distribution of the largest
or smallest member of a sample. Proc. Camb. Phil. Soc. 24, 180.

Gumbel, E. J. (1936). Les valeurs extrêmes des distributions statistiques. Ann. Inst. Poincaré, 5,
115-58.

Hartley, H. O. (1942). The range in random samples. Biometrika, 32, 334-48.

Hartley, H. O. & Pearson, E. S. (1942). The probability integral of the range in samples of n
observations from a normal population. Biometrika, 32, 301-10.

Jahnke, E. & Emde, F. (1909). Funktionentafeln. Leipzig and Berlin.

McKay, A. T. & Pearson, E. S. (1933). A note on the distribution of range in samples of n. Biometrika,
25, 415-20.

Pearson, E. S. (1926). A further note on the distribution of range in samples taken from a normal
population. Biometrika, 18, 173-94.

Pearson, E. S. (1932). The percentage limits for the distribution of range in samples from a normal
population. Biometrika, 24, 404-17.

Tippett, L. H. C. (1926). On the extreme individuals and the range of samples taken from a normal
population. Biometrika, 17, 364-87.





LIMITS OF THE RATIO OF MEAN RANGE TO
STANDARD DEVIATION*

By R. L. PLACKETT, B.A.

The ratio of mean range in samples of n to population standard deviation $\sigma$, which
has been denoted by $d_n$, is used in control chart work (when the population is assumed
normal) to estimate $\sigma$ from the ranges of a set of small samples. On comparing the series of
values of $d_n$ for different n when the parent population is rectangular with the series when
it is normal (see table below), it is clear that for $n \le 12$ the two series agree to within less
than 10%. With this in mind, the question arises: what are the limiting values of $d_n$ for
a given n? It is shown here that populations exist for which $d_n$ is arbitrarily near to zero,
while for no population will $d_n$ exceed the value




We consider a population whose distribution function is $F(x)$ and which extends from
$-a$ to $+a$, so that $F(-a) = 0$ and $F(a) = 1$. The population in the first place may have
any finite limits, but there is no loss in generality in supposing these. It is required to find
limits to the ratio

$$d_n = \frac{\int_{-a}^{a}\left[1-F^n-(1-F)^n\right]dx}{\left\{\int_{-a}^{a}x^2\,dF-\left(\int_{-a}^{a}x\,dF\right)^2\right\}^{1/2}}. \tag{1}$$

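We apply the calculus of variations and find the extremes of $d_n$ in the class of functions $F$
such that $F(-a) = 0$ and $F(a) = 1$; the case is thus one of fixed end-points. Suppose that
$F(x) = u(x)$ gives an extreme value and form the functions $F(x) = u(x) + tv(x)$; for $t$ suitably
near to zero, all these will be permissible distribution functions, i.e. monotonically increasing,
provided $v(-a) = v(a) = 0$. Then for $t = 0$, $\frac{d}{dt}(d_n)$ is zero for all functions $v(x)$.

Since

$$d_n = \frac{\int_{-a}^{a}\left[1-(u+tv)^n-(1-u-tv)^n\right]dx}{\left[\int_{-a}^{a}x^2(u'+tv')\,dx-\left\{\int_{-a}^{a}x(u'+tv')\,dx\right\}^2\right]^{1/2}},$$

$$\left[\frac{d}{dt}d_n\right]_{t=0} = \frac{-n\int_{-a}^{a}\left[u^{n-1}-(1-u)^{n-1}\right]v\,dx\left[\int_{-a}^{a}x^2u'\,dx-\left(\int_{-a}^{a}xu'\,dx\right)^2\right]-\tfrac12\int_{-a}^{a}\left[1-u^n-(1-u)^n\right]dx\left[\int_{-a}^{a}x^2v'\,dx-2\int_{-a}^{a}xu'\,dx\int_{-a}^{a}xv'\,dx\right]}{\left[\int_{-a}^{a}x^2u'\,dx-\left(\int_{-a}^{a}xu'\,dx\right)^2\right]^{3/2}}.$$

Now $\int_{-a}^{a}x^2v'\,dx = -2\int_{-a}^{a}xv\,dx$ since $v(a) = v(-a) = 0$, and by the same condition

$$\int_{-a}^{a}xv'\,dx = -\int_{-a}^{a}v\,dx.$$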

* Communication from the National Physical Laboratory.





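The numerator now becomes of the form $\int_{-a}^{a}s(x)v(x)\,dx$, and this must be zero for all
functions $v(x)$; it is therefore concluded that $s(x)$ is identically equal to zero. In fact

$$n\left[\int_{-a}^{a}x^2u'\,dx-\left(\int_{-a}^{a}xu'\,dx\right)^2\right]\left[(1-u)^{n-1}-u^{n-1}\right] = \left[\int_{-a}^{a}\{1-u^n-(1-u)^n\}\,dx\right]\left[\int_{-a}^{a}xu'\,dx-x\right],$$

so that if $\mu$ is the mean, $\sigma$ the standard deviation, $w_n$ the mean range in samples of $n$, and
$F(x)$ the distribution function of the population which gives an extreme value to $d_n$, we have

$$w_n(x-\mu) = n\sigma^2\left[F^{n-1}-(1-F)^{n-1}\right].$$

Put $x = -a$ and obtain $w_n(a+\mu) = n\sigma^2$; $x = a$ gives $n\sigma^2 = w_n(a-\mu)$, whence $\mu = 0$ and

$$x = a\left[F^{n-1}-(1-F)^{n-1}\right]. \tag{2}$$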

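This distribution must give an upper limit to $d_n$, since if we consider a distribution of the
type below:

[Figure: a population on $-\tfrac12 \le x \le \tfrac12$ with area $1-2y$ concentrated at $x = 0$ and area $y$
spread uniformly over each of the two half-intervals; $y = \epsilon$.]

$$F(x) = (2x+1)y, \quad -\tfrac12 \le x < 0; \qquad F(x) = 1-(1-2x)y, \quad 0 < x \le \tfrac12,$$

the ratio (1) for $y = \epsilon$ is approximately $n\sqrt{(3\epsilon/2)}$, which can be made as small as we
please.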
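The size of this ratio can be checked numerically. The following is a minimal sketch, not part of the original note: it assumes the three-part population just described (area $1-2y$ at $x = 0$, area $y$ uniform on each half-interval) and evaluates ratio (1) in closed form; the function names `ratio_three_part` and `small_y_approximation` are illustrative only.

```python
import math


def ratio_three_part(n: int, y: float) -> float:
    """Ratio (1), mean range / standard deviation, for the three-part
    population: area 1 - 2y at x = 0 and area y spread uniformly over
    each of (-1/2, 0) and (0, 1/2). Assumes 0 < y <= 1/2 and n >= 2."""
    # On each half-interval F is linear in x, so the integral of
    # 1 - F^n - (1 - F)^n over x can be written in closed form.
    # expm1/log1p keep 1 - (1 - y)^(n + 1) accurate when y is tiny.
    one_minus_power = -math.expm1((n + 1) * math.log1p(-y))
    mean_range = (y - y ** (n + 1) / (n + 1) - one_minus_power / (n + 1)) / y
    sigma = math.sqrt(y / 6.0)  # the mean is 0 and the variance is y/6
    return mean_range / sigma


def small_y_approximation(n: int, y: float) -> float:
    """Leading term n * sqrt(3y/2) of the ratio for small y."""
    return n * math.sqrt(1.5 * y)


if __name__ == "__main__":
    for y in (1e-2, 1e-4, 1e-6):
        print(y, ratio_three_part(5, y), small_y_approximation(5, y))
```

With $y = \tfrac12$ the central concentration disappears and the function reduces to the rectangular case, which gives a convenient check on the closed form.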

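Reverting therefore to equation (2) we note that, since $aw_n = n\sigma^2$, $d_n(\text{max.}) = n\sigma/a$. Now

$$\sigma^2 = \int_{-a}^{a}x^2\,dF-\left(\int_{-a}^{a}x\,dF\right)^2 = a^2\int_0^1\left[F^{n-1}-(1-F)^{n-1}\right]^2dF.$$

Therefore

$$\frac{\sigma^2}{a^2} = \frac{2}{2n-1}\left\{1-\frac{[(n-1)!]^2}{(2n-2)!}\right\},$$

i.e.

$$d_n(\text{max.}) = n\sqrt{\frac{2}{2n-1}\left\{1-\frac{[(n-1)!]^2}{(2n-2)!}\right\}}. \tag{3}$$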
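As a numerical cross-check of (3), the sketch below integrates the mean range and the variance of the population (2) directly over $F$ by the midpoint rule and compares the resulting ratio with the closed form. It is an illustration only; the function names, the step count and the choice of the midpoint rule are assumptions made here, not taken from the paper.

```python
import math


def d_max_closed_form(n: int) -> float:
    """Equation (3): n * sqrt( 2/(2n-1) * (1 - ((n-1)!)^2 / (2n-2)!) )."""
    frac = math.factorial(n - 1) ** 2 / math.factorial(2 * n - 2)
    return n * math.sqrt(2.0 / (2 * n - 1) * (1.0 - frac))


def d_by_quadrature(n: int, a: float = 1.0, steps: int = 20_000) -> float:
    """Mean range / sigma for the population x = a[F^(n-1) - (1-F)^(n-1)],
    by midpoint integration over F in (0, 1). Assumes n >= 2."""
    h = 1.0 / steps
    mean_range = 0.0
    second_moment = 0.0
    for i in range(steps):
        F = (i + 0.5) * h
        x = a * (F ** (n - 1) - (1 - F) ** (n - 1))
        dx_dF = a * (n - 1) * (F ** (n - 2) + (1 - F) ** (n - 2))
        # Mean range: integral of 1 - F^n - (1 - F)^n with respect to x.
        mean_range += (1 - F ** n - (1 - F) ** n) * dx_dF * h
        # Variance: the mean is 0 by symmetry, so only x^2 is needed.
        second_moment += x * x * h
    return mean_range / math.sqrt(second_moment)


if __name__ == "__main__":
    for n in (2, 3, 5, 8, 12):
        print(n, d_max_closed_form(n), d_by_quadrature(n))
```

The second argument `a` is included to show that the ratio does not depend on the half-width of the population, in line with the remark that (3) is independent of $a$.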


It is of interest to note that all the foregoing analysis may be carried out with $a$ equal to
any finite value, and so we may take the limit as $a \to \infty$, and equation (3), which is independent
of $a$, will still hold.

It is easy to verify, by Stirling's formula or otherwise, that as $n$ increases $[(n-1)!]^2$
becomes negligible compared with $(2n-2)!$.

Consequently, for large $n$,

$$d_n(\text{max.}) \simeq n\sqrt{\{2/(2n-1)\}} = \sqrt{\{2n/(2-1/n)\}},$$

i.e.

$$d_n(\text{max.}) \simeq \sqrt{(n+\tfrac12)}. \tag{4}$$

The probability density function of (2) is obtained by differentiation and is

$$f(x) = \frac{1}{a(n-1)\left[F^{n-2}+(1-F)^{n-2}\right]}, \tag{5}$$




so that (2) and (5) are the parametric equations of the curve in terms of its distribution
function. Thus for $n > 2$, $f(0) = \dfrac{2^{n-3}}{a(n-1)}$ and $f(\pm a) = \dfrac{1}{a(n-1)}$. The distributions (2) are readily
seen to be unimodal and symmetrical about $x = 0$. For $n = 2, 3$ they are rectangular. For
$F > \tfrac12$ and large $n$, $x/a \simeq F^{n-1}$. Hence $f(x) \simeq \dfrac{1}{x(n-1)}$, and similar considerations for
$F < \tfrac12$ show that for large $n$ and $x \ne 0$,

$$f(x) \simeq \frac{1}{|x|(n-1)}.$$

From (4), $\sigma \simeq a/\sqrt{n}$. Consequently, for any finite $a$, as $n \to \infty$ the distributions (2) tend to
a single ordinate at $x = 0$. This should be compared with the limiting case giving $d_n \to 0$
for fixed $n$ illustrated with the diagram above. The limiting form of the two distributions is
the same but the approach to the limit with increasing $n$ is quite different. There is no
approach to normality.

Following is a table of $d_n(\text{max.})$ and of $d_n$ in samples from normal and rectangular popula-
tions for $n = 2, ..., 12$. The quantity $\sqrt{(n+\tfrac12)}$ is also included to see how closely (4) is
approximated. The values of $d_n$ (normal) are obtained from the paper by E. S. Pearson
(1942). For a rectangular distribution $d_n$ is simply $2\sqrt{3}\,(n-1)/(n+1)$.


   n     √(n+½)     d_n (max.)    d_n (normal)    d_n (rectangular)

   2     1.58114     1.15470        1.128           1.15470
   3     1.87083     1.73205        1.693           1.73205
   4     2.12132     2.08396        2.059           2.07846
   5     2.34521     2.34013        2.326           2.30940
   6     2.54951     2.55333        2.534           2.47436
   7     2.73861     2.74414        2.704           2.59808
   8     2.91548     2.92076        2.847           2.69430
   9     3.08221     3.08685        2.970           2.77128
  10     3.24037     3.24440        3.078           2.83426
  11     3.39116     3.39466        3.173           2.88675
  12     3.53553     3.53860        3.258           2.93116
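The computed columns of the table can be regenerated from the formulas in the text. The sketch below is an illustration under stated assumptions: `d_max` implements equation (3), `d_rectangular` the expression $2\sqrt{3}\,(n-1)/(n+1)$, and the normal-population values in `D_NORMAL` are simply copied from the table rather than recomputed.

```python
import math

# d_n for a normal population, copied from the table (not recomputed here).
D_NORMAL = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534, 7: 2.704,
            8: 2.847, 9: 2.970, 10: 3.078, 11: 3.173, 12: 3.258}


def d_max(n: int) -> float:
    """Upper limit of mean range / sigma, equation (3)."""
    frac = math.factorial(n - 1) ** 2 / math.factorial(2 * n - 2)
    return n * math.sqrt(2.0 / (2 * n - 1) * (1.0 - frac))


def d_rectangular(n: int) -> float:
    """Mean range / sigma for a rectangular population."""
    return 2.0 * math.sqrt(3.0) * (n - 1) / (n + 1)


if __name__ == "__main__":
    for n in range(2, 13):
        print(f"{n:3d} {math.sqrt(n + 0.5):9.5f} {d_max(n):9.5f} "
              f"{D_NORMAL[n]:7.3f} {d_rectangular(n):9.5f}")
```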


Some values of $d_n$ for a number of symmetrical populations were given by Pearson
& Adyanthaya (1928) and have been reproduced with some figures for one skew population
in Tables for Statisticians and Biometricians, Part II, Table XXIII. The majority of these
values were obtained empirically from random sampling experiments. These values were
of course subject to sampling error and for this reason are in three cases very slightly
above $d_n(\text{max.})$.


Some of the preceding work was done as part of the Research and Development programme 
of the Ministry of Supply (S.R. 17) and appears by permission of the Chief Scientific Officer. 
It was completed as part of the research programme of the National Physical Laboratory, 
and this paper is published by permission of the Director of the Laboratory.

REFERENCES 

Pearson, E. S. (1942). The probability integral of the range in samples of n observations from
a normal population. Biometrika, 32, 301-10.

Pearson, E. S. & Adyanthaya, N. K. (1928). The distribution of frequency constants in small samples
from symmetrical populations. Biometrika, 20A, 356-60.










SIGNIFICANCE TESTS FOR 2 x 2 TABLES

By G. A. BARNARD, Imperial College

Part I

The theory of statistical significance tests deals with abstractions of experimental results. 
The fact that the figures dealt with may happen to be tensile strengths of iron bars, or 
perhaps weights of babies, is ignored in the carrying out of the test; and for the purpose of 
statistical theory the experiment in question could just as well be represented by an experi- 
ment involving the drawing of balls from urns. In fact, it is an advantage, from some points 
of view, to replace the concrete experiment involved in a particular practical case by an 
‘abstract’ urn-experiment, in order to retain in view only those features of the case which 
can be dealt with by statistical methods. 

It is obvious enough that the first step in the statistical treatment of an experimental 
result may be represented as the replacement of the concrete experiment by an ‘urn- 
experiment’; but the implications of this have not always had the continuous attention they 
deserve. Once the abstract picture has been formed, the analysis of it is largely a matter 
of pure mathematics. What distinguishes the statistician from the pure mathematician, in 
this connexion, should be the statistician’s ability to form valid abstract pictures of concrete 
cases, and his clear recognition of the limits of validity of his abstract pictures. Yet we find 
relatively little discussion in statistical text-books of the process of formation of these
abstract pictures.

It is the purpose of the first part of this paper to draw attention to the confusion which 
may arise through the possible formation of several different abstract pictures, each of 
which may apply to some concrete cases, though not to others. 

Suppose we are given two mass-production processes, A and B, and we wish to test 
whether process A and process B are equally satisfactory, in the sense that neither process 
is more likely to produce defective items than the other. For this purpose we take, say, 
m articles made by process A, and n made by process B, and test them, under suitable con- 
ditions. We find that a out of the m articles are defective, while b out of the n articles are
defective, a result which can be represented in the form of a 2 x 2 table (Table 1).


Table 1

                 I (defective)    II (non-defective)    Total

Process A             a                   c               m
Process B             b                   d               n

Total                 r                   s               N


The statistical analysis of results of this type has been much discussed, but it seems to 
have escaped notice that, on the facts incompletely stated as above, it is possible to form 
several different abstract pictures, any one of which might be appropriate to the real case 
in question. The adoption of one picture rather than another will depend, in a given case,
on further knowledge which is not specified above. 






The basis of Fisher's ‘exact’ test

The current generally accepted test for results of the above type is that given by Fisher 
(1941), or some approximation to it. The simplest abstract picture* to which this test corre- 
sponds would seem to be one in which the m articles made by process A and the n articles 
made by process B are represented by N similar balls, m of them marked A and n marked B. 
The N balls are put into an urn, and then withdrawn in random order. As they are withdrawn, 
the balls are placed, in order, in a row of N receptacles, r of which have been marked ‘I’, 
the remainder being marked ‘II’. The result of Table 1 then represents the observation that 
a of the balls marked A are in receptacles marked ‘I’. The probability of such a result, in
such an experiment, is

$$\frac{m!\,n!\,r!\,s!}{N!\,a!\,b!\,c!\,d!}, \tag{1}$$


which can be seen by considering that the contents of the r receptacles marked ‘ I’ form a 
sample of r from an urn containing m balls marked A and n balls marked B, the sampling 
being done without replacement. The probability (1), added to those of all results less 
probable than that obtained, is the basis of Fisher’s test. 
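Expression (1) and the sum described in the last sentence can be illustrated as follows. This is a sketch only: the function names are hypothetical, and the treatment of ties (results exactly as probable as the one observed are included in the sum) is an assumption made here, since the text speaks only of results "less probable" than that obtained.

```python
from fractions import Fraction
from math import comb


def point_probability(a: int, b: int, c: int, d: int) -> Fraction:
    """Expression (1): m! n! r! s! / (N! a! b! c! d!), written with
    binomial coefficients as C(m, a) C(n, b) / C(N, r)."""
    m, n, r = a + c, b + d, a + b
    return Fraction(comb(m, a) * comb(n, b), comb(m + n, r))


def tail_sum(a: int, b: int, c: int, d: int) -> Fraction:
    """Probability (1) of the observed table plus the probabilities of all
    tables with the same margins that are no more probable than it.
    Including ties is an assumption of this sketch."""
    m, n, r = a + c, b + d, a + b
    observed = point_probability(a, b, c, d)
    total = Fraction(0)
    for a2 in range(max(0, r - n), min(m, r) + 1):
        b2 = r - a2
        prob = point_probability(a2, b2, m - a2, n - b2)
        if prob <= observed:
            total += prob
    return total


if __name__ == "__main__":
    # Illustrative table with m = 8, n = 6 and a = 2, b = 5, c = 6, d = 1.
    print(point_probability(2, 5, 6, 1), tail_sum(2, 5, 6, 1))
```

Exact rational arithmetic is used so that the comparison `prob <= observed` is not disturbed by rounding.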

In the concrete case given, the N balls, initially similar, may be taken to correspond with 
the N items of raw materials. The process of labelling the balls A and B corresponds to the 
selection of m of the items of raw material, and their fabrication into articles by process A, 
and the fabrication of the n remaining ones by process B. The N receptacles into which the 
balls are eventually placed then represent the N ‘test occasions’ which must be provided 
for when the experiment is laid out. The fact that these receptacles are labelled ‘I’ or ‘II’ 
before the balls are placed in them corresponds to the assumption of the hypothesis being 
tested — that the processes do not differ in respect of liability to defectives, so that whether 
or not a given article is defective has nothing to do with whether it is A or B. The labelling
‘I’ or ‘II’ is thus assumed independent of the labelling of the balls. Finally, the random 
allocation of balls to receptacles corresponds to a precaution which might have been taken in 
the concrete case, viz. the random order of test of the article secured by the use of random 
numbers or the like. 


The basis of the C.S.M. test

Another abstract picture, also applicable to the concrete case as incompletely described
above, forms the basis of the test to be developed in the later part of this paper, which we
have called the C.S.M. test. In this picture, the two processes, A and B, are represented by two
urns, A and B, each urn containing a large number of balls, some of which are marked ‘I’,
while the others are marked ‘II’. The selection for test of m articles of process A is represented
by the random drawing of m balls from urn A; and similarly for the n articles of process B.
The test procedure corresponds to the examination of the balls, to see whether they are
marked ‘I’ or ‘II’. The liability of process A to produce defectives is represented by the
proportion $p_a$ of balls marked ‘I’ in urn A, while $p_b$ similarly represents the liability of
process B. The hypothesis we wish to test says that $p_a = p_b = p$, say. The probability of a
result such as that of Table 1 is very nearly

$$\frac{m!}{a!\,c!}\,p_a^{a}(1-p_a)^{c}\;\frac{n!}{b!\,d!}\,p_b^{b}(1-p_b)^{d}, \tag{2}$$



* Though not the only possible one. By following Fisher's argument, as given in his book, one can
construct a more complicated picture which leads to a similar result.





which, on the hypothesis tested, becomes

$$\frac{m!\,n!}{a!\,b!\,c!\,d!}\,p^{r}(1-p)^{s}. \tag{3}$$

We may notice that the expression (3) differs from (1) by a factor

$$\frac{N!}{r!\,s!}\,p^{r}(1-p)^{s},$$

and it would have been obtained in the earlier case if we had assumed that the labelling of
the receptacles was itself done randomly, by selection of N labels from a box containing a
large number of labels, the proportion marked ‘I’ being p.
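The relation between (1) and (3) can be verified with exact arithmetic. The sketch below is illustrative; the function names and the value $p = 1/3$ are arbitrary choices made here.

```python
from fractions import Fraction
from math import comb


def prob_fixed_margins(a: int, b: int, c: int, d: int) -> Fraction:
    """Expression (1): m! n! r! s! / (N! a! b! c! d!)."""
    m, n, r = a + c, b + d, a + b
    return Fraction(comb(m, a) * comb(n, b), comb(m + n, r))


def prob_two_urns(a: int, b: int, c: int, d: int, p: Fraction) -> Fraction:
    """Expression (3): m! n! / (a! b! c! d!) * p^r (1 - p)^s."""
    m, n = a + c, b + d
    r, s = a + b, c + d
    return comb(m, a) * comb(n, b) * p ** r * (1 - p) ** s


if __name__ == "__main__":
    p = Fraction(1, 3)  # arbitrary illustrative value of the common proportion
    a, b, c, d = 2, 5, 6, 1
    N, r, s = a + b + c + d, a + b, c + d
    factor = comb(N, r) * p ** r * (1 - p) ** s  # N!/(r! s!) p^r (1-p)^s
    print(prob_two_urns(a, b, c, d, p) == prob_fixed_margins(a, b, c, d) * factor)
```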

To justify the application of our second picture to a concrete case, we should have to be 
satisfied that the conditions of process A and those of process B were sufficiently stable, in 
a statistical sense, to justify the formation of the notions corresponding to $p_a$ and $p_b$. We
should further have to make sure that our selection of samples of m and n respectively was 
for practical purposes random. And finally, we should have to be reasonably sure that the 
conditions of test themselves had practically no influence on the results of the test — that the 
test used revealed a real property of the article tested, rather than a property of the in- 
dividual conditions of test. 


Another type of abstract experiment 

Another case of common occurrence may be represented by a single urn, containing balls
each of which carries two marks, one mark being either A or B, the other mark being
either ‘I’ or ‘II’. The experiment consists in drawing N balls from the urn, at random, and
examining their markings. If the proportion of balls marked ‘A I’ is $p_{a1}$, while $p_{a2}$, $p_{b1}$, $p_{b2}$
similarly represent the proportions of the other markings in the urn, the probability asso-
ciated with Table 1 in this case is

$$\frac{N!}{a!\,b!\,c!\,d!}\,p_{a1}^{a}\,p_{a2}^{c}\,p_{b1}^{b}\,p_{b2}^{d}, \tag{4}$$

by the multinomial theorem, provided the number of balls in the urn is large. In this case
the hypothesis tested, that the markings ‘I’ and ‘II’ on the one hand, and the markings A
and B on the other, are independent, may be put in the form

$$p_{a1}\,p_{b2} = p_{a2}\,p_{b1},$$

and, assuming that $(p_{a1}+p_{a2}) = p'$ and $(p_{b1}+p_{b2}) = 1-p'$, and $(p_{a1}+p_{b1}) = p$ and
$(p_{a2}+p_{b2}) = 1-p$, do not vanish, the probability of our result, on the hypothesis tested,
can be expressed as

$$\frac{N!}{a!\,b!\,c!\,d!}\,p^{r}(1-p)^{s}\,(p')^{m}(1-p')^{n}, \tag{5}$$

which differs from (3) by a factor

$$\frac{N!}{m!\,n!}\,(p')^{m}(1-p')^{n}.$$

This shows that (5) is related to (3) in much the same way as (3) is related to (1).
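The same kind of check applies to (4), (5) and (3). In the sketch below the four cell proportions are built from illustrative values of $p$ and $p'$ under the independence hypothesis; all names are hypothetical and the numerical values are arbitrary.

```python
from fractions import Fraction
from math import comb, factorial


def multinomial_coefficient(a: int, b: int, c: int, d: int) -> int:
    """N! / (a! b! c! d!)."""
    return factorial(a + b + c + d) // (
        factorial(a) * factorial(b) * factorial(c) * factorial(d))


def prob_four_cells(a, b, c, d, p_a1, p_a2, p_b1, p_b2):
    """Expression (4): multinomial probability of Table 1."""
    return (multinomial_coefficient(a, b, c, d)
            * p_a1 ** a * p_a2 ** c * p_b1 ** b * p_b2 ** d)


def prob_double_dichotomy(a, b, c, d, p, p_prime):
    """Expression (5): probability under independence of the two markings."""
    m, n, r, s = a + c, b + d, a + b, c + d
    return (multinomial_coefficient(a, b, c, d)
            * p ** r * (1 - p) ** s * p_prime ** m * (1 - p_prime) ** n)


def prob_two_urns(a, b, c, d, p):
    """Expression (3): m! n! / (a! b! c! d!) * p^r (1 - p)^s."""
    m, n, r, s = a + c, b + d, a + b, c + d
    return comb(m, a) * comb(n, b) * p ** r * (1 - p) ** s


if __name__ == "__main__":
    p, p_prime = Fraction(1, 3), Fraction(2, 5)  # arbitrary illustrative values
    a, b, c, d = 2, 5, 6, 1
    m, n = a + c, b + d
    factor = comb(m + n, m) * p_prime ** m * (1 - p_prime) ** n
    print(prob_double_dichotomy(a, b, c, d, p, p_prime)
          == prob_two_urns(a, b, c, d, p) * factor)
```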

This situation could present itself in our concrete case if the articles made by the two 
processes A and B were mixed up together in a common store, and the test sample of N 
were randomly drawn from this store, the subsequent conditions being as in the second case. 
Statisticians with industrial experience may perhaps feel it is unlikely that the experiment 




would be performed in this way; but it must be admitted that it could have been. Cases 
such as this seem to occur more frequently in biometric investigations, where a population 
of animals is being tested for the association or otherwise of two characters. 

Nomenclature 

The name ‘double dichotomy’ has been applied generally to all experiments leading to 
results of the form of Table 1, but the foregoing analysis would suggest that it might be more
appropriate to restrict this term to the third case we have indicated. Since the second case 
can be obtained from the third by supposing the numbers of articles made by process A 
and by process B to be fixed, we might then call the second case the (singly) restricted double 
dichotomy. Similarly, the first case would be called the doubly restricted double dichotomy. 
Such a nomenclature, apart from a lack of euphony, would be open to the objection that it 
would tend to imply that the third case was the general one, the first two being derivatives 
of it. This, in turn, would imply that the subject-matter of our investigation in cases one and
two was in reality a four-fold universe, the restrictions on numbers being merely matters of 
experimental technique. But such is not always the case. The question implied in our second 
case presupposes two two-fold populations, which are to be compared, and no four-fold 
super-population need exist for this question to have meaning. 

We therefore propose the names ‘ double dichotomy ’ for the third case, ‘2x2 comparative 
trial’ for the second case, and ‘2x2 independence trial’ for the first case, though here again 
an objection on aesthetic grounds would be easy to sustain. 

Finer distinctions 

In principle it could be maintained that there is a distinction between the 2x2 compara- 
tive trial, as instanced above, and a restricted double dichotomy. As we have said, the funda- 
mental subject-matter of a 2 x 2 comparative trial is a pair of populations ; while the subject- 
matter of a restricted double dichotomy is a four-fold population from which we happen, by 
an accident of experimental technique, to be able to extract samples in which the numbers 
of items having certain characteristics are fixed. The latter case could arise, for example, 
if an attempt was being made to discover association between colour of eyes in school- 
children and some less easily identified characteristic, such as membership of a particular 
blood-group. We could imagine that an experimenter might pick out m children with (say) 
blue eyes, and n without blue eyes, and then, having obtained his samples, he might subject 
them to a test for blood-group. The conclusions drawn from such an experiment would 
presumably be intended to apply to the population of school-children, a four-fold one relative 
to the two characteristics in question. The distinction between the two cases comes out if
we consider what happens if, in the 2 x 2 comparative trial, all items tested turn out to be
defective. In this case we should say that our question, whether $p_a = p_b$ or not, tends to be
answered in the affirmative. In the case of the school-children, if they all turn out to have
the same blood-group, then no conclusion on our question about the four-fold population 
can be drawn at all. 

Similar distinctions apply to the 2x2 independence trial. In the psycho-physical experi- 
ment described by Fisher (1942), where the point at issue is whether or not a lady can tell 
whether the milk or the tea has been put in the cup first, no statistical population is pre- 
supposed. The question would have meaning even if we refused to regard the order of in- 
sertion of milk or tea as ever being a matter of chance, while at the same time we regarded 





the lady’s guess as equally determinate. The ‘statistical population’ enters into this experi- 
ment only in the experimental technique, via the randomization procedure used to fix the 
order of presentation of cups; it does not enter into the question being asked. In this case, 
the extreme result, in which in fact the milk was put in first every time, while the lady 
guessed every time that it was otherwise, would be taken as evidence against the lady’s 
claim. But such a result could by itself have no meaning for the question asked in the case 
of a restricted 2x2 trial or a doubly restricted double dichotomy. 

Further types of experimental procedure leading to results expressible in the form of
Table 1 are the various sequential procedures that have been described for deciding questions
of the kind we have been discussing (3, 4). Yet another procedure is one where the conditions
of trial vary from one block of tests to another, as when an open-air trial runs over several
days of inconstant weather. Here we might suppose there were $k$ pairs of urns, $(A_1, B_1)$,
$(A_2, B_2)$, ..., $(A_k, B_k)$. The distinctions here are, however, obvious enough, and they are
worth noting only in order to emphasize that the mere fact that results are presented in the
form of Table 1 is not in itself sufficient to specify an appropriate test of significance.

Part II 

The significance test for the 2x2 trial 

Roughly speaking, the object of a significance test as applied to results of the type con-
sidered is to answer the question: Can these results be ascribed to ‘chance’? In this form,
the question is not sufficiently precise. If our ‘urn model’ for the 2 x 2 comparative trial is
adequate to represent the experiment actually carried out, then the results will in any case
be ‘due to chance’, in some sense. What we wish to know in this case is whether a particular
kind of chance, namely one in which $p_a = p_b = p$, can be said to account for our results.
If the results are such that this explanation of them is untenable, then we may conclude
either that our particular ‘urn model’ of the experiment is inadequate anyway, or we may
retain the model and conclude that $p_a$ and $p_b$ must be unequal. In most cases, of course,
we shall reach the latter conclusion, since we would not have made up the urn model in
question unless we had some reasons for believing in its adequacy, but it is well to bear in
mind the first alternative, in case a re-examination of the circumstances may make us change
our minds. A point very strongly emphasized by Fisher in his book The Design of Experi-
ments is that we ought to have in mind a particular ‘urn model’ before the experiment is
performed, and arrange the conduct of the experiment so that the adequacy of this urn
model is not likely to be questioned afterwards.

With the qualifications indicated, we can say that the object of the significance test we 
propose to develop is, to enable a particular class of explanations of our experimental results 
to be ruled out as untenable. Specifically, given results like those of Table 1, we want to be 
able to say that they could not be accounted for by supposing that the experiment we 
actually performed was analogous to the urn experiment with two urns in which $p_a = p_b = p$.
This raises the question, in what sense could such a supposition fail to account for the 
observed results ? Any result of the form of Table 1 could arise in an experiment of this kind, 
when our supposition is true. Why, then, should we select some results of this form and say 
they are incompatible with our supposition? 

In the last analysis, this question cannot be answered without an examination of what is 
meant in general by statements involving probabilities, a point which is still the subject of




controversy. But in our particular case (if not in all cases) we can avoid giving a general 
answer to the question of what probability is, by considering the practical circumstances 
which form the setting for our particular problem, and the uses to which we propose to put 
the answer. In fact, in our case we are interested in the equality or otherwise of $p_a$ and $p_b$
because we want to decide which of the two processes, A and B, is to be preferred, from the
point of view of defectives produced. To say that $p_a$ is greater than $p_b$ will mean, for us, that
process B is preferable, and conversely if $p_b$ is greater than $p_a$; while to say that $p_a$ and $p_b$
are equal will mean that there is nothing to choose between the two processes. In fact, to
say that $p_a = p_b$, in our case, means that, if process A and process B are both used, then it
will be found that the frequencies with which defectives appear in the two processes will,
for practical purposes, be equal.* Thus we shall assert that results in which the observed
frequencies, $a/m$ and $b/n$, differ widely, are incompatible with the supposition that $p_a = p_b$;
in doing so, we shall be neglecting as impossible a class of events which are in reality logically
possible, but whose probability is small. The precise formulation of a test of significance
then reduces to a precise formulation of what is meant by a ‘wide difference’ in the fre-
quencies $a/m$ and $b/n$, and to an evaluation of the probability of those events which are being
neglected as impossible.

The lattice diagram 

If we consider the first problem, of arranging results like those of Table 1 in order of the 
relative ‘width’ of the differences they indicate, a first step is the enumeration of all possible
results in a convenient form. 

Logically, we should begin by noting that Table 1 is really an abbreviated version of the 
results of any one particular experiment, which will to start with be like those of Table 2 
(where we have taken m = 8, n = 6, for definiteness). 


Table 2

Urn:   A   A   A   A   A   A   A   A   B   B   B   B   B   B

Mark:  II  I   II  II  I   II  II  II  I   I   II  I   I   I

But if, as we are presupposing, our urn analogy is adequate to represent the conditions of 
the experiment, the order in which the results were obtained must be irrelevant to the 
interpretation of results. If the conditions of trial varied during the course of the experiment,
this assumption might not be correct: for example, if the trial were an open-air trial, and
it began to rain half-way through. We are assuming that the urn analogy is adequate, and
so we must treat all results like Table 2 which give the same values to a, b, c, d, in Table 1,
as equivalent. Table 1 therefore stands for $m!\,n!/a!\,b!\,c!\,d!$ distinct, but equivalent, results
which we shall not distinguish from now on.
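The reduction of a detailed record like Table 2 to the four cell frequencies, and the count of equivalent records, may be sketched as follows. The names `summarise` and `multiplicity` are illustrative; the sequences are those of Table 2 with m = 8, n = 6.

```python
from math import comb

# The record of Table 2: urn label and mark of each of the 14 balls.
URNS = "AAAAAAAABBBBBB"
MARKS = ["II", "I", "II", "II", "I", "II", "II", "II",
         "I", "I", "II", "I", "I", "I"]


def summarise(urns, marks):
    """Collapse an ordered record into the cell frequencies (a, b, c, d)."""
    a = sum(1 for u, k in zip(urns, marks) if u == "A" and k == "I")
    c = sum(1 for u, k in zip(urns, marks) if u == "A" and k == "II")
    b = sum(1 for u, k in zip(urns, marks) if u == "B" and k == "I")
    d = sum(1 for u, k in zip(urns, marks) if u == "B" and k == "II")
    return a, b, c, d


def multiplicity(a: int, b: int, c: int, d: int) -> int:
    """Number m! n! / (a! b! c! d!) of distinct ordered records that
    give the same cell frequencies."""
    return comb(a + c, a) * comb(b + d, b)


if __name__ == "__main__":
    cells = summarise(URNS, MARKS)
    print(cells, multiplicity(*cells))
```

Summing the multiplicities over every possible pair (a, b) recovers the $2^{N}$ ordered records, which is a quick consistency check on the count.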

If we now take rectangular axes in a plane, we can represent Table 1 by the point whose
coordinates are (a, b). Thus ‘x’ in Fig. 1 represents the set of results equivalent, in the sense
of the previous paragraph, to the results of Table 2. At the same time, all possible results
of the experiment which gave rise to Table 2 are represented by the points of the rectangle

* We hope that the qualifications we have attached to our statements will be sufficient to guard us
against the accusation that we have adopted in full a ‘frequency theory’ of probability. The frequency
interpretation is relevant to our particular problem; other problems may involve other interpretations.
More than one interpretation may be relevant in a single problem.





PQRS. We call this representation of possible results the lattice diagram.* Our problem 
may now be regarded as one of ordering the points of the lattice diagram according to the 
‘width’ of the difference they indicate. 

Conditions S and C

In trying to make the idea of ‘width’ of difference precise, we are up against difficulties 
similar to those attaching to the interpretation of results on the basis of incomplete infor- 
mation about the circumstances of the experiment. The information given at first was 
compatible with several distinct ‘urn models’. Similarly, the information given now is 
compatible with several different notions about ‘width’ of difference. We may be concerned
with the arithmetical size of the difference $p_a - p_b$, with the ratio $p_a/p_b$, with the
logarithm of this ratio, or with some more complicated function.

Logically, therefore, we should expect to set up various tests, based on various ideas of 
what constitutes ‘width ’ of difference in probability or frequency (in Neyman and Pearson’s 


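[Fig. 1. Lattice diagram for m = 8, n = 6: abscissa a = 0, 1, ..., 8; ordinate b = 0, 1, ..., 6;
the lattice points fill the rectangle PQRS, and ‘x’ marks the result of Table 2.]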


Table 3

            I       II      Total

A           c       a         m
B           d       b         n

Total       s       r         N


language, corresponding to various weight functions over the space of alternatives to the 
hypothesis tested). But here a factor which may simply be described as laziness enters in. 
If we carried our ideas to their logical conclusion, we should find ourselves constructing a 
new test for almost every new experiment we had to deal with; and the time and effort 
involved in this are too great. Consequently, we confine our attempt to producing a test 
which will be reasonably applicable to a wide class of cases of the type specified, without 
suggesting that this test is unique, or ‘best possible’. 

First, then, in our ordering of points in the lattice diagram, we propose that the same rank 
should be given to the point ((m - a), (n - b)) as to the point (a, b). This condition we propose
to call the ‘symmetry condition’, or ‘condition S’. It amounts to saying that if Table 1
is to be considered as indicating a real difference between $p_a$ and $p_b$, then so is Table 3, in
which the labels ‘I’ and ‘II’ have been interchanged. If, when we are testing whether 

* Not the sample space of Neyman and Pearson. In the sample space, different results equivalent 
to Table 2 are represented by different points. 


$p_a = p_b$, we are also testing whether $1 - p_a = 1 - p_b$ from the same point of view,
then this symmetry condition is clearly justified.*

Next, we propose that in our ordering, the two points which, respectively, have the same
abscissa or the same ordinate as (a, b), and which lie further from the diagonal PR, shall be
considered as indicating wider differences than (a, b) itself. Thus, referring to Fig. 1, the
points immediately above and immediately to the left of the point ‘x’ are reckoned to
indicate wider differences than the point ‘x’ itself. This condition implies that the set of
points indicating differences as wide or wider than (a, b) will have a shape property vaguely
related to convexity, and we call it the ‘C condition’. It means that if we consider the
table corresponding to Table 2, with cell frequencies

    2  6
    5  1

as significant evidence of difference, then we must also consider the tables

    1  7           2  6
    5  1    and    6  0

as significant evidence of difference. It is difficult to imagine circumstances where this
would not be so.

Geometrically, condition S implies that we can in future restrict our considerations to
points in the lattice diagram lying on or above the diagonal PR, i.e. in the triangle PRS.
And condition C implies that, in this triangle, our ‘width of difference’ must increase as
we go upwards or to the left. If horizontal and vertical axes are taken at any point X in this
triangle, points in the second quadrant are associated with a wider difference than X is,
points in the fourth quadrant are associated with narrower differences than X is. The relative
width of differences associated with points in the first and third quadrants (excluding the
axes) are not determined by the conditions C and S. The ordering generated by these
conditions is thus a partial, not a total, ordering; it is, in fact, a kind of conical order, in the
sense of A. A. Robb. We must introduce some further condition to make the ordering total.
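The partial ordering generated by conditions S and C can be made concrete with a small comparison routine. This is a sketch under the reading given above: a point is first reflected into the triangle on or above the diagonal (condition S), and two points are then compared only when one lies in the second or fourth quadrant of the other (condition C). The function names and the returned labels are illustrative.

```python
def reflect(pt, m, n):
    """Condition S: (a, b) and (m - a, n - b) receive the same rank."""
    a, b = pt
    return (m - a, n - b)


def canonical(pt, m, n):
    """Representative of pt lying on or above the diagonal b/n = a/m."""
    a, b = pt
    return pt if b * m >= a * n else reflect(pt, m, n)


def compare(x, y, m, n):
    """Return 'equal', 'x wider', 'y wider' or 'undetermined'."""
    if y == x or y == reflect(x, m, n):
        return "equal"
    (a, b), (a2, b2) = canonical(x, m, n), canonical(y, m, n)
    if a2 <= a and b2 >= b:   # y is above and/or to the left of x
        return "y wider"
    if a2 >= a and b2 <= b:   # y is below and/or to the right of x
        return "x wider"
    return "undetermined"     # first or third quadrant: C and S are silent


if __name__ == "__main__":
    m, n = 8, 6
    for y in [(2, 6), (1, 5), (6, 1), (3, 4), (3, 6)]:
        print((2, 5), y, compare((2, 5), y, m, n))
```

The `undetermined` outcome is what makes a further condition necessary before the points can be ranked completely.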


Probability considerations 

In many simpler cases, it is possible to distinguish those events which are considered 
incompatible with a given probability hypothesis by their relatively low probability, 
compared with other possible events. Such a simple comparison of probabilities is not open 
to us in this case, because to each point (a, b) we have, on the hypothesis tested, associated
a function

$$W(a,b;p) = \frac{m!\,n!}{a!\,b!\,c!\,d!}\,p^{r}(1-p)^{s},$$

which contains the ‘nuisance parameter’ $p$. If we consider the relative position, in our
ordering, of another point, $(a', b')$, we have to consider the inequality

$$W(a,b;p) < W(a',b';p), \tag{6}$$

the truth or falsehood of which depends, in general, on the unknown $p$; and there is nothing
in the statement of the problem, nor in the experimental method, to justify any particular
choice for the value of $p$.


* Cases where $p_a > p_b$ is impossible are hereby neglected, strictly.





If $(a+b) = (a'+b')$, the validity or otherwise of the inequality (6) is independent of $p$.
Thus, using this inequality as a criterion for ordering our points, we can say that in the
triangle PRS, the ‘width of difference’ must increase as we move north-west. But this is
all that can be derived from this criterion, and it is clearly even less helpful in ordering the
points than the conditions C and S are. Moreover, if we recall that each point (a, b) in the
lattice diagram really represents a set of $m!\,n!/a!\,b!\,c!\,d!$ distinct results, each with probability
$p^{r}(1-p)^{s}$, the criterion (6) loses its plausibility.

We might try to improve the situation by associating the function W(a, b; p) with a
number, depending on a and b only. For fixed a and b, this number would be a functional
of W(a, b; p). We should clearly require that, if the inequality (6) is true for all p, then the
corresponding inequality should be true of the numbers associated with W(a, b; p) and
W(a', b'; p). The simplest functionals which satisfy this condition will be the mean value,

$$w(a, b) = \int_0^1 W(a, b; p)\,dp,$$

the maximum value

$$w'(a, b) = \max_{0<p<1} W(a, b; p),$$

and one single value

$$w''(a, b) = W(a, b; p_0).$$


Circumstances could be imagined in which any of these three criteria might produce
reasonable tests of significance. For example, in certain genetical experiments we may have
reason to suppose that the value p = 1/3 would occur more often than any other value. In
such a case we might use w'', with $p_0 = 1/3$. But for general purposes taking $p_0 = 1/3$ could
not be justified.
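
As an illustration only (it is not part of the original paper), the three functionals can be evaluated numerically. The sketch below is in Python; the function names are hypothetical, and it assumes c = m - a and d = n - b, so that the exponents of p and 1 - p in W are a + b and c + d. The mean value uses the Beta integral of p^r (1-p)^s over (0, 1), which equals r! s!/(r + s + 1)!, and the maximum of p^r (1-p)^s is taken at p = r/(r + s).

```python
from math import comb, factorial


def W(a, b, m, n, p):
    """W(a, b; p) for samples of sizes m and n, assuming c = m - a, d = n - b."""
    r = a + b                 # exponent of p
    s = (m - a) + (n - b)     # exponent of 1 - p
    return comb(m, a) * comb(n, b) * p**r * (1 - p)**s


def w_mean(a, b, m, n):
    """Mean value w(a, b): integral of W over 0 < p < 1 (Beta integral)."""
    r = a + b
    s = (m - a) + (n - b)
    return comb(m, a) * comb(n, b) * factorial(r) * factorial(s) / factorial(r + s + 1)


def w_max(a, b, m, n):
    """Maximum value w'(a, b): p^r (1-p)^s peaks at p = r / (r + s).

    When r = 0 or s = 0 this is the supremum, approached at an end of (0, 1).
    """
    r = a + b
    s = (m - a) + (n - b)
    return W(a, b, m, n, r / (r + s))


def w_at(a, b, m, n, p0):
    """Single value w''(a, b) = W(a, b; p0) for a chosen p0."""
    return W(a, b, m, n, p0)
```

For instance, `w_at(2, 0, 2, 2, 1/3)` evaluates the third criterion at the value p = 1/3 mentioned in the genetical example.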

We might again argue that taking w as our criterion would correspond with the assumption 
that all values of p were a priori equally likely. But some would say that such an assumption 
was never justified; while those who would admit the assumption would in strictness do so 
only if we really did know nothing about the value of p. And in the general circumstances we 
are trying to cater for, we may sometimes know something vague about the value of p — 
such as, for example, that p will be less than 

Neyman and Pearson have shown that the likelihood ratio, which in our case comes to be

$$\frac{m^m\,n^n\,(a+b)^{a+b}\,(c+d)^{c+d}}{a^a\,b^b\,c^c\,d^d\,N^N},$$

very often gives a good basis for ordering experimental results. We feel, however, that the
criterion we shall describe in the next section has a slightly more direct justification than
the likelihood ratio, though the choice is, admittedly, largely a matter of taste.


The maximum condition 

Before setting out the final condition which, with conditions S and C, will be used eventually
to arrange the points of the lattice diagram in order of ‘relative width of difference
indicated’, we need to consider the assignment of significance levels to various results.

When we say that a given result is not significant on, say, the 5 % level, we mean that
such a result, or one indicating a wider difference, could occur, with probability at least 0.05,
even when $p_a = p_b$. We could believe in a theory that $p_a = p_b$, without having to suppose
that an event belonging to a class whose joint probability was less than 0.05 had occurred.
Conversely, if a result is judged significant on the 5 % level, it means that no theory which
assumed that $p_a = p_b$ could account for the result obtained without supposing that an event
of a type whose probability was less than 0.05 had occurred on the occasion in question.

Let us now consider a specific case, in which we choose numbers which in practice would
be ridiculously small in order to save arithmetic. Suppose, in fact, m = n = 2, while a = 2
and b = 0. It follows from conditions S and C alone that in judging the significance of such
a result we need consider only the probability of this result, together with its converse, in
which a = 0, b = 2. If $p_a = p_b = p$, the probability of results of this type is

$$P = 2p^2(1-p)^2.$$

Now suppose that we are prepared to discard as untenable theories which require us to suppose
that events of probability less than 0.05 had occurred. In such a case, we should discard
a theory which supposed $p_a = p_b = 0.1$, since in this case $P = 2(0.1)^2(0.9)^2 = 0.0162$, less
than 0.05. But we could not discard a theory which supposed $p_a = p_b = 1/2$, since in this
case P = 0.125. In fact, our result would enable us to discard all theories involving $p_a = p_b = p$,
except those for which p lay in the interval 0.197 < p < 0.803. In particular practical cases
we might be prepared, on grounds external to the experiment in question, to dismiss the
possibility that p should lie in this interval; and in such cases we should be entitled to say
that the result excludes the possibility that $p_a = p_b$.
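
The arithmetic of this small case can be checked directly. The following Python sketch is an illustration added here, with hypothetical function names; it solves 2p²(1-p)² = α for p in closed form, using the fact that p(1-p) = √(α/2) is a quadratic in p.

```python
from math import sqrt


def P(p):
    """Probability of the result (2, 0) or its converse (0, 2) when m = n = 2."""
    return 2 * p**2 * (1 - p)**2


def compatible_interval(alpha=0.05):
    """Values of the common p for which P(p) >= alpha, or None if there are none.

    P(p) >= alpha  <=>  p(1 - p) >= sqrt(alpha / 2), a quadratic inequality in p.
    """
    x = sqrt(alpha / 2)
    if x > 0.25:              # p(1 - p) never exceeds 1/4
        return None
    half_width = sqrt(1 - 4 * x) / 2
    return 0.5 - half_width, 0.5 + half_width


low, high = compatible_interval(0.05)
print(round(P(0.1), 4), P(0.5), round(low, 3), round(high, 3))
```

Run as written, this reproduces the figures quoted in the text: 0.0162, 0.125, and the interval from 0.197 to 0.803.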

It is easy to see that the above specific case is typical. Any set of points in the lattice
diagram, considered by some criterion agreeing with conditions S and C to indicate differences
as wide or wider than those of a given result, will be associated with a probability P,
on the assumption $p_a = p_b = p$; and this P will be a function of p, rising from zero when
p = 0 to a maximum in the neighbourhood of p = 1/2, and then falling again symmetrically
(by the S condition) to zero again at p = 1, somewhat as in Fig. 2. The given result by itself
will exclude the possibility $p_a = p_b$ altogether, only if the significance level adopted is greater
than $P_m$, the maximum value of P. If our significance level corresponds to a probability
less than $P_m$, then all we can say is that our result is incompatible with $p_a = p_b$ unless their
common value lies in a certain subset of the range (0, 1). We may or may not exclude these
latter possibilities on other grounds.

In trying to construct our test, however, we have set ourselves the task of evaluating the
evidence provided by our experiment alone in relation to the hypothesis $p_a = p_b$. It now appears
that this is impossible so long as we restrict ourselves to the form, usual in such cases, of a
simple statement that a given result is, or is not, significant on a given level. We have two
alternatives. Either we can find an entirely new form of statement to convey what we wish
to express; or we can adhere to the form of statement, and try to make the situation fit the
form as nearly as possible. Perhaps the day will come when experimenters do not require
answers in the form of numbers, when they are sufficiently versed in generalized mathematical
analysis to be content with a function (such as the function P(p)), instead of a single
number. But we have not yet reached this stage; and so we propose to take up the latter
alternative, and try to make the situation fit the standard form of statement of significance
tests as nearly as possible.*

Our difficulty arises from the dependence of P on p. If the graph of P against p were a 
horizontal straight line, our difficulty would be overcome. What we propose, therefore, is 
to try to make the graph of P against p as near to a horizontal line as possible, by suitably 
adapting our idea of what is meant by ‘width of difference’. In making this adaptation, we 
shall secure that we do not violate the common-sense requirements as to the meaning of the 
term ‘width of difference’, by requiring that conditions C and S should always be satisfied.

The maximum condition 

The condition C requires that, of all points in the triangle PRS, that indicating the
‘widest difference’ must be the point S at the corner (Fig. 1). The function P associated with
this point and its converse, Q, which we may denote as P(0, 6; p), is

$$P(0, 6; p) = p^6(1-p)^8 + p^8(1-p)^6,$$

and the maximum $P_m$ occurs here when p = 1/2, where we have

$$P_m(0, 6) = 1/2^{13} = 1.22 \times 10^{-4}.$$

The condition C requires that the only points which might be considered as coming next
after S, in order of decreasing ‘width of difference’, are (1, 6) and (0, 5). We have to adopt
some principle to choose between these two.

If (1, 6) were taken next after (0, 6), the function P associated with it would be

$$P'(1, 6; p) = P(0, 6; p) + 16p^7(1-p)^7,$$

and $P'_m(1, 6)$ would come to $9/2^{13} = 10.99 \times 10^{-4}$. On the other hand, if (0, 5) were chosen
next, instead of (1, 6), we should have

$$P(0, 5; p) = P(0, 6; p) + 6[p^9(1-p)^5 + p^5(1-p)^9],$$

and $P_m(0, 5)$ would come to $8.55 \times 10^{-4}$, the maximum occurring when p = 1/2 ± …

Thus $P_m(0, 5)$ is smaller than $P'_m(1, 6)$, and this lower maximum is associated with a flatter
curve of P(0, 5; p). Since a flat curve is our aim (the horizontal line being the ideal), we
choose (0, 5) as the point to come next after (0, 6), rather than (1, 6).

Having chosen (0, 5) as the next ‘widest difference’ point, the C condition restricts us to
the points (1, 6) and (0, 4) as candidates for the next position. We consequently compare

$$P(1, 6; p) = P(0, 5; p) + 16p^7(1-p)^7$$

with

$$P''(0, 4; p) = P(0, 5; p) + 15[p^4(1-p)^{10} + p^{10}(1-p)^4],$$

and the lower value of $P_m$ as criterion shows that (1, 6) is now to be taken. At the next stage,
we shall have to compare the functions associated with (0, 4), (1, 5) and (2, 6). In this way
we can arrange the points of the lattice diagram in order, step by step.
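
The first comparison can be reproduced numerically. The sketch below is an added illustration, not the author's own computation. It assumes the sample sizes m = 8, n = 6, which are inferred here from the coefficients 6 and 16 and from the corner value 1/2^13; the passage itself does not restate them. The helper names and the simple grid search over p are likewise assumptions of the sketch.

```python
from math import comb

M, N = 8, 6  # assumed sample sizes (inferred, not stated in the passage)


def pair_term(a, b, p):
    """Probability of the point (a, b) together with its converse (M - a, N - b)."""
    r = a + b                 # exponent of p for (a, b)
    s = (M - a) + (N - b)     # exponent of 1 - p for (a, b)
    return comb(M, a) * comb(N, b) * (p**r * (1 - p)**s + p**s * (1 - p)**r)


def max_over_p(points, steps=20000):
    """Grid-search maximum over 0 < p < 1 of the summed pair terms; returns (max, p)."""
    best_value, best_p = -1.0, None
    for i in range(1, steps):
        p = i / steps
        value = sum(pair_term(a, b, p) for a, b in points)
        if value > best_value:
            best_value, best_p = value, p
    return best_value, best_p


corner, _ = max_over_p([(0, 6)])
via_16, _ = max_over_p([(0, 6), (1, 6)])
via_05, p_05 = max_over_p([(0, 6), (0, 5)])
print(corner, via_16, via_05, p_05)
```

Under these assumptions the candidate (0, 5) gives the lower maximum, and that maximum falls slightly to one side of p = 1/2, in line with the choice made in the text.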

* In the example just taken we might make a kind of ‘conditional confidence interval statement’,
that, if p existed, we should have 0.197 < p < 0.803 with confidence coefficient 0.95.

The principle involved, which we call the ‘maximum condition’, may be formally stated
as follows:

Considering only points for which a/m is less than b/n, if the first (n - 1) points $(a_1, b_1)$,
..., $(a_{n-1}, b_{n-1})$, in order of decreasing ‘width of difference’ have been chosen, and
$(a_{n-1}, b_{n-1})$ is associated with the function $P(a_{n-1}, b_{n-1}; p)$, then the nth point, $(a_n, b_n)$, is
that point, of all points (a, b) permitted by the C condition, for which

$$\max_{0<p<1}\left[P(a_{n-1}, b_{n-1}; p) + \frac{m!\,n!}{a!\,b!\,c!\,d!}\left\{p^{a+b}(1-p)^{c+d} + p^{c+d}(1-p)^{a+b}\right\}\right]$$

is least. $(a_n, b_n)$ is then associated with the function

$$P(a_n, b_n; p) = P(a_{n-1}, b_{n-1}; p) + \frac{m!\,n!}{a_n!\,b_n!\,c_n!\,d_n!}\left\{p^{a_n+b_n}(1-p)^{c_n+d_n} + p^{c_n+d_n}(1-p)^{a_n+b_n}\right\}.$$

To complete the specification of the ordering, we have to legislate for the case where there
are several points giving the same value of $P_m(a, b)$, this value being less than that associated
with any other permissible point. In this case we lay down that all such points are to be
given the same rank, and the second term in the expression for $P(a_n, b_n; p)$ is to be replaced
by the corresponding sum over all these points. If there are h such points at any stage, then
the next point after them will be denoted as the (n + h)th point in the ordering. This requires,
for example, when m = n, that the points (a, b) and (b, a) are always to be taken together.

Finally, the significance level to be attached to the point $(a_n, b_n)$ will be

$$P_m(a_n, b_n) = \max_{0<p<1} P(a_n, b_n; p).$$

This guarantees that our test will be a ‘valid’ one, in the sense that, if we judge a result
incompatible with the hypothesis $p_a = p_b$ on a given level of significance, then all the
possibilities of the form $p_a = p_b$ are excluded, to the given level. Thus no further information,
external to the experiment in question, could make us decide that a result judged significant
by our test was not in fact so (holding, of course, to a fixed significance level); on the other
hand, we still have the possibility that other information may lead us to consider as significant
results which appear in themselves not to be so. The formulation of our maximum
condition is made so as to minimize this latter possibility. Our test is thus conservative, in
the sense that we do not draw the conclusion $p_a \neq p_b$ unless this is certainly warranted by
the data; but it might be called ‘progressive conservative’, because, of all such conservative
tests, it will be the least conservative.
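
The step-by-step ordering lends itself to a short program. What follows is a minimal sketch in Python of one reading of the conditions C, S and M as stated above; it is not the author's own procedure for computing the tables. The name `csm_levels`, the grid search over p, and the tolerance used to detect ties are all assumptions introduced for the illustration.

```python
from math import comb


def csm_levels(m, n, grid=2000, tol=1e-9):
    """Order the points with a/m < b/n by the maximum condition.

    Returns (order, levels): order is a list of groups of tied points, and
    levels maps each point to max over p of the accumulated function P.
    """
    ps = [i / grid for i in range(1, grid)]          # grid on 0 < p < 1

    def pair_term(a, b):
        # Point (a, b) plus its converse (m - a, n - b), by the S condition.
        r, s = a + b, (m - a) + (n - b)
        k = comb(m, a) * comb(n, b)
        return [k * (p**r * (1 - p)**s + p**s * (1 - p)**r) for p in ps]

    upper = {(a, b) for a in range(m + 1) for b in range(n + 1) if a * n < b * m}
    terms = {pt: pair_term(*pt) for pt in upper}
    ranked, levels, order = set(), {}, []
    current = [0.0] * len(ps)                        # accumulated P on the grid

    while len(ranked) < len(upper):
        # C condition: the neighbours to the west and to the north, which
        # indicate wider differences, must already have been ranked.
        candidates = [(a, b) for (a, b) in upper - ranked
                      if (a == 0 or (a - 1, b) in ranked)
                      and (b == n or (a, b + 1) in ranked)]
        scores = {pt: max(c + t for c, t in zip(current, terms[pt]))
                  for pt in candidates}
        best = min(scores.values())
        tied = sorted(pt for pt in candidates if scores[pt] <= best * (1 + tol))
        for pt in tied:                              # tied points enter together
            current = [c + t for c, t in zip(current, terms[pt])]
        level = max(current)
        for pt in tied:
            levels[pt] = level
        ranked.update(tied)
        order.append(tied)
    return order, levels


order_86, levels_86 = csm_levels(8, 6)
order_77, levels_77 = csm_levels(7, 7)
print(order_86[:3], order_77[:2])
```

With m = 8, n = 6 the first three points come out as (0, 6), (0, 5), (1, 6), matching the worked example; with m = n = 7 the points (0, 6) and (1, 7) tie and enter together. Points with a/m > b/n receive the level of their converse. A finer grid or a proper one-dimensional optimizer could replace the grid search if more accuracy were wanted.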


Another aspect of the maximum condition

When the author first approached the problem of analysis of experimental results of the
type now considered, he did so from the point of view of regarding the significance level to
be used as being fixed in advance, say at the 5 % level. From this point of view, the problem
of constructing a test resolved itself, not into one of ordering the points in the lattice diagram,
but into one of choosing a region, or set of points in the lattice diagram, such that any point
belonging to this region could be regarded as evidence of inequality of $p_a$ and $p_b$, on the given
level of significance. The condition of symmetry required that such a region should consist
of two similar parts, one above the diagonal and one below it. The condition C required
that the part of the region lying above the diagonal PR should be so shaped that if a point
X belonged to the region, then so would all points lying north or west of X. There remained
the problem, to decide which of the many regions satisfying these two conditions should be
the one adopted.

To settle this, to any such region R we can associate a function

$$P(R; p) = \sum_{(a, b) \in R} \frac{m!\,n!}{a!\,b!\,c!\,d!}\,p^{a+b}(1-p)^{c+d},$$

and such a region will give a ‘valid’ test of significance provided that

$$\max_{0<p<1} P(R; p) < 0.05.$$

There will not be so many regions satisfying this validity condition as well as the conditions
S and C. We proposed, therefore, to select that region from among these, which had the
greatest number of points in it. This last condition was what we then called the ‘maximum
condition’. The fact that this region would not be unique in cases where m = n was taken
care of by requiring a subsidiary symmetry condition that in such cases (a, b) and (b, a)
should always be taken together.
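
The validity condition for a fixed region can be checked in a few lines. The Python sketch below is an added illustration with hypothetical function names; it again assumes c = m - a, d = n - b and uses a simple grid over p in place of an exact maximization.

```python
from math import comb


def region_probability(region, m, n, p):
    """P(R; p): total probability of the points of R when both proportions equal p."""
    total = 0.0
    for a, b in region:
        r = a + b
        s = (m - a) + (n - b)
        total += comb(m, a) * comb(n, b) * p**r * (1 - p)**s
    return total


def max_region_probability(region, m, n, steps=2000):
    """Grid approximation to the maximum of P(R; p) over 0 < p < 1."""
    return max(region_probability(region, m, n, i / steps) for i in range(1, steps))


def is_valid(region, m, n, alpha=0.05):
    """Validity in the sense of the text: the maximum of P(R; p) lies below alpha."""
    return max_region_probability(region, m, n) < alpha


print(is_valid({(0, 7), (7, 0)}, 7, 7), is_valid({(2, 0), (0, 2)}, 2, 2))
```

The two corner points of the (7, 7) diagram form a valid region at the 5 % level, whereas the region of the earlier m = n = 2 example does not, since its maximum is 0.125.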

What we have now adopted as the ‘maximum condition’ can be seen to be related to this
earlier version, by the consideration that, roughly speaking, apart from effects due to the
discreteness of the lattice diagram, holding the number of points in the region constant, and
then choosing the region which gives the lowest value for $P_m$, as we do now, comes to the same
thing as holding $P_m$ constant, and then choosing the region to have the maximum number
of points.

Other things being equal, the ‘power’ of a test, in the sense of Neyman and Pearson, will
increase with the ‘volume’ of the rejection region chosen. In this sense we can say, roughly,
that the maximum condition secures that our test should be as powerful as possible, consistent
with validity.

Practical formulation of the test

Some statistical tests (such as that due to Fisher, already mentioned), can be carried out 
in the form of a direct calculation from the data, without reference to any special tables. 
Most other tests require the use of special tables which, however, are for the most part tables 
of single or double entry, perhaps triple entry, if the level of significance is regarded as a 
variable. In our case, regarding the level of significance as a variable, a table of quadruple 
entry would be required. 

Ideally, a set of tables, one for each pair of values of m and n (m > n) would be required.
The table would be in the shape of a right-angled triangle, corresponding to the triangle
PRS of Fig. 1, and divided into squares, each square corresponding to given values of a and b.
Within each square (a, b) would then appear a number, the value of $P_m(a, b)$. This value of
$P_m(a, b)$ then is the maximum probability of obtaining the result (a, b), or one indicating a
wider difference, if $p_a = p_b$. A comparison of $P_m(a, b)$ with the significance level adopted will
then decide the significance or otherwise of our result. In any particular case we shall be able
to see which tables, in the sense of our test, are regarded as indicating a wider difference, by
noting which points are associated with lower values of $P_m(a, b)$.

In practice, it will be impossible to construct such tables for a large range of values of m
and n. But for larger values of m and n, a test based on a normal approximation to the
distributions involved will be quite adequate for practical purposes. In fact, the test we have
proposed will itself approximate, in some sense, to a test based on the normal distribution,
though we do not enter into a detailed discussion of the relationship between the two tests
here.* Tables are thus required for our test only for small values of m and n. In spite of
advice by statisticians to the contrary, such small values of m and n continue to occur
frequently in practice.

* The general question of the sense in which tests are regarded as ‘asymptotically approaching’
normal tests is a subject for another paper. Professor Pearson’s paper, which follows, bears on this point.




In the Appendix we give specimen tables for the cases where N = 14. The comparative
figures for the Fisher test, also given in the Appendix, indicate that the differences between
the two tests are appreciable. An exploration is now under way into larger values of m and n,
and it is hoped to report on this in due course.


Other applications of the C.S.M. procedure

We have spoken of our test as the C.S.M. test, as if the case dealt with above were the only
case to which the procedure adopted was applicable. But similar methods could be used
in many other cases. In particular, a method closely following the one we have used might
be applied to the case we have called the double dichotomy, which differs from the 2 x 2
comparative trial in that two ‘nuisance parameters’ p and p' are present, instead of only
one. The 2-dimensional lattice diagram of the 2 x 2 trial is replaced by a 3-dimensional
regular tetrahedron of points with homogeneous coordinates (a, b, c, d), connected by the
relation

a + b + c + d = N.


Two opposite edges of this tetrahedron correspond to m = 0 and n = 0, and sections of the
tetrahedron by planes parallel to these edges will look exactly like lattice diagrams for the
2 x 2 case, and within these sections relative probabilities will behave just as in the 2 x 2 case.
An examination of the possibilities, however, indicates that not much is to be gained by a
detailed treatment. The C.S.M. test for 2 x 2 comparative trials will be a valid test if applied
to double dichotomies. It will err somewhat on the side of ‘conservatism’, but the error
does not appear to be large, except when the numbers involved are exceedingly small.

It is with a view to further applications of the approach used in this paper that we have 
retained the C condition as a separate requirement, although it is easy to see that it could 
be absorbed into the M condition as we have given it. 


In writing this paper the author has had great personal help and encouragement from 
Prof. E. S. Pearson, to whom he wishes to express his very deep thanks. 


SUMMARY

In Part I we discuss various types of experiment, each of which may give rise to results 
in the form of a 2 x 2 table. It appears that significance tests which may be appropriate 
for one type of experiment will not necessarily be appropriate for another. 

In Part II a test is developed for experiments of the type called ‘2 x 2 comparative
trials’. 


APPENDIX 
Tables for the C.S.M. test

Three tables are given below to illustrate the application of the ideas given in the main paper
to the construction of a test for 2 x 2 comparative trials. The cases covered are pairs of
samples, sizes (7, 7), (8, 6), and (9, 5). The small figures in brackets in the (7, 7) table give
significance levels on Fisher’s ‘exact’ test for 2 x 2 independence trials, for comparison.
Only half of the (8, 6) and (9, 5) tables are given; the missing parts can be filled in by
symmetry. The following examples show the meaning and use of the tables:





Example 1. Two boxes, each containing a large number of components, are to be tested 
for comparative quality measured by the respective proportions of defective components 
they contain. Two samples, each of seven components, are taken, at random, one from each 
box. One sample gives four defectives, the other, none. What is the significance of this result, 
in relation to the hypothesis that the boxes have the same quality? 

Answer. Entering the (7, 7) table at the point (0, 4), we find the number 2.4. This means
that the result is evidence against the hypothesis, on the 2.4 % level of significance.





Table for m = n = 7

  b
  7    0.012     0.18      0.70      2.4       7.5       20        -         -
       (0.058)   (0.23)    (2.1)     (7.0)     (19)      (46)
  6    0.18      1.3       5.7       13        -         -         -         -
       (0.23)    (2.9)     (10)      (27)
  5    0.70      5.7       21        -         -         -         -         20
       (2.1)     (10)      (29)                                              (46)
  4    2.4       13        -         -         -         -         -         7.5
       (7.0)     (27)                                                        (19)
  3    7.5       -         -         -         -         -         13        2.4
       (19)                                                        (27)      (7.0)
  2    20        -         -         -         -         21        5.7       0.70
       (46)                                              (29)      (10)      (2.1)
  1    -         -         -         -         13        5.7       1.3       0.18
                                               (27)      (10)      (2.9)     (0.23)
  0    -         -         20        7.5       2.4       0.70      0.18      0.012
                           (46)      (19)      (7.0)     (2.1)     (0.23)    (0.058)

  a =  0         1         2         3         4         5         6         7


More precisely, what is asserted is, that the maximum probability of getting a result not
less significant than that obtained, is 0.024. And the results which are not less significant
are those which correspond to points in the table with numbers not greater than 2.4, viz.
(0, 4), (7, 3), (0, 5), (7, 2), (0, 6), (7, 1), (0, 7), (7, 0), (1, 6), (6, 1), (1, 7), (6, 0), (2, 7), (5, 0),
(3, 7), (4, 0). By suitable choice of the proportion defective, we could construct a pair of
boxes, of equal quality, which would give samples falling in this group 24 times out of 1000,
on the average; but we could not, by any choice of proportion defective, retain equal quality
and yet have results in this group more often than 24 times in 1000.
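
The group of sixteen points can be read off mechanically from the (7, 7) table. The Python sketch below is an added illustration: it stores the levels (in per cent) for the points with a < b as printed, fills in the other half by the symmetry (a, b) to (7 - a, 7 - b), and lists every point whose level does not exceed that of a given result. The variable and function names are hypothetical.

```python
M = N = 7

# Levels in per cent for the points with a < b, keyed by b; entries run a = 0, 1, 2, ...
UPPER = {
    7: [0.012, 0.18, 0.70, 2.4, 7.5, 20],
    6: [0.18, 1.3, 5.7, 13],
    5: [0.70, 5.7, 21],
    4: [2.4, 13],
    3: [7.5],
    2: [20],
}

levels = {}
for b, row in UPPER.items():
    for a, value in enumerate(row):
        levels[(a, b)] = value
        levels[(M - a, N - b)] = value   # converse point, by symmetry


def not_less_significant(point):
    """All tabulated points whose level does not exceed that of `point`."""
    cutoff = levels[point]
    return sorted(pt for pt, value in levels.items() if value <= cutoff)


group = not_less_significant((0, 4))
print(len(group), group)
```

For the result (0, 4) of Example 1 this yields the sixteen points listed in the text.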


Table for m = 8, n = 6

        0       1       2       3       4       5       6       7       8

  6   0.012   0.18    0.71    2.5     5.3     13      -       -       -       -
  5   0.085   1.3     6.6     11      -       -       -       -       -       -       5
  4   0.44    3.9     19      -       -       -       -       -       -       -       4
  3   1.9     16      -       -       -       -       -       -       -       6.3     3
  2   8.0     -       -       -       -       -       -       20      3.8     0.86    2
  1   23      -       -       -       -       -       14      7.4     1.3     0.13    1
  0   -       -       -       16      10      6.3     2.3     0.62    0.19    0.012   0

        0       1       2       3       4       5       6       7       8       9

Table for m = 9, n = 5


Example 2. The situation is as before, except that the first sample has nine components, 
none of them defective, while the second sample has five components, four of them defective. 






THE CHOICE OF STATISTICAL TESTS ILLUSTRATED ON THE 
INTERPRETATION OF DATA CLASSED IN A 2 x 2 TABLE 

By E. S. PEARSON 

CONTENTS

(i) Introductory
(ii) The choice of statistical tests
(iii) Application of this approach to the analysis of data classed in a 2 x 2 table
(iv) Problem I
(v) Problem II
(vi) Solution of Problem II, using the normal approximation
(vii) The classical approach to Problem II
(viii) Problem III
(ix) General comment
References
Appendix


(i) INTRODUCTORY

1. The problem of testing the significance of a difference between two proportions is one
which receives early attention in text-books on mathematical statistics, and it might be
thought to be one of the questions whose final solution lies behind us. It is a problem whose
simplicity makes it easy to examine the logical cogency of the methods put forward for its
solution, but, on examination, it is evident that they have not yet been rounded off
satisfactorily. The origin of the present paper lies partly in an investigation commenced in 1938
and discussed at the time in College lectures, and partly in recent correspondence in Nature
in which G. A. Barnard (1945a, b) and R. A. Fisher (1945a) have taken part.* This
correspondence has suggested that in a problem of such apparent simplicity, starting from
different premises, it is possible to reach what may sometimes be very different numerical
probability figures by which to judge significance.

2. Such a difference in levels of significance in the solution of an everyday problem is
obviously puzzling to the users of statistical methods who are accustomed to accept the
technique as an established procedure and have not the opportunity for a critical examination
of the conditions under which probability theory is brought to bear as a guide to action.
For the question here at issue is a fundamental one of why and how our judgement is
influenced by the calculation of a probability, and the dilemma raised by the Barnard-Fisher
correspondence can only be answered in terms of our views on the practical function of the
theory. We may all agree that in practice we use probability figures derived from an analysis
of numerical data to help us to make up our minds on the next step, whether in experimental
research or executive action. But what form of presentation of the probability set-up
is likely to result in the greater number of sound decisions is likely to be always a matter for
differences of opinion.

* There was also an earlier discussion on the same subject between E. B. Wilson (1941, 1942) and
R. A. Fisher (1941).

3. All that I can do is to approach the problem of the 2 x 2 table from the viewpoint
which appears most helpful to me. In the preceding paper Mr Barnard has elaborated the
views expressed in his letters to Nature. Such discussion is, I believe, desirable, even though
controversial issues are raised. For the value of the whole elaborate structure of the
modern theory of mathematical statistics depends at least in part on the sense in which the
individual statistician appreciates the meaning of the probability model he is using when
drawing the practical conclusions from his analysis of data. I have used the words ‘in part’,
for it is true that the analytical process of applying the statistical technique to experimental
data may in itself be enormously illuminating even without paying any close regard
to a final probability figure. Such is the case, for example, with the technique of analysis of
variance, where the mere process of breaking up a total sum of squares into parts with which
different sources of variability can be associated, brings with it a reward in clear thinking
even without the application of a probability test.

4. There is a very wide variety in the types of situation in which probability theory is 
introduced to help in reaching a decision as to further action. 

(A) At one extreme we have the case where repeated decisions must be made on results
obtained from some routine procedure carried out under controlled conditions.

(B) At the other is the situation where statistical tools are applied to an isolated investiga- 
tion of considerable importance in which many of the issues involved in the conclusion can 
hardly be assessed in numerical terms. 

5. Two situations of this kind, in which the statistical technique involved is that of testing 
the significance of a difference between two proportions, may be illustrated from problems 
arising in the ‘proof’ of armour-piercing shot or shell. 

6. Example of type A. In the proof of small anti-tank, armour-piercing shot it might be
decided to set aside, as a standard, a batch of shot whose quality has been established by
special trials; against this standard, later batches can be compared. The variable measured
is the proportion of shot which fail to perforate a plate of specified thickness when fired with
a given striking velocity. The use of standard shot is necessary for calibration purposes,
because there are inevitable changes in toughness from one proof plate to another and only
a limited number of shot can be fired at a single plate. Then the situation might be summed
up as follows:*

Aim of proof. To ensure that as few batches as possible are passed into service which 
are less effective than the standard. 

Method of proof. Twelve rounds of the standard and twelve of the batch under test to be 
fired, round for round, against a single test plate and a record kept of the number of failures 
in each group, say a and b. 

Routine sentencing rule. This should lay down a ready means of determining, from a
knowledge of a and b, whether to class the new batch as inferior to the standard or not. 

Assumptions accepted in using rule. That the two samples of twelve shot have each been
randomly selected from the much larger batches. That against the particular plate used, a
proportion $p_1$ of the standard and $p_2$ of the new batch would fail to give satisfactory
perforation at the specified striking velocity. That while $p_1$ and $p_2$ would be different for other
plates, if $p_2 > p_1$ for one plate, it will be so for all other plates. The objective is to segregate
batches of shot for which $p_2 > p_1$.

* It has been somewhat simplified for illustrative purposes, e.g. complete control of the striking
velocity is not in practice possible.





7. Example of type B. Two types of heavy armour-piercing naval shell of the same 
calibre are under consideration; they may be of different design or made by different firms. 
Since the cost of producing and testing a single round of this kind runs into many hundreds 
of pounds, the investigation is a costly one, yet the issues involved are far reaching. Twelve 
shells of one kind and eight of the other have been fired; two of the former and five of the 
latter failed to perforate the plate. In what way can a statistical test contribute to the 
decision which must be taken on further action? 

8. In dealing with Example A the guiding principle followed in seeking help from the
theory of probability can be very simple. We can set as our object a rule which:

(i) will result in an increasing chance of detecting that $p_2 > p_1$, the larger the difference;

(ii) will leave only a small chance of segregating the new batch wrongly when, in fact, $p_2 \le p_1$.

Diagrammatically the rule would consist in segregating the new batch when the point (a, b)
falls within some such area as that shown shaded in Fig. 1. In this problem involving a
routine procedure, it is the long-run frequency of different consequences of the proof
sentencing which is of importance, and probability theory is introduced to provide a measure
of expected frequency. This method of introducing the theory of probability into this proof
problem is not necessarily the only one that could be adopted in fixing a routine procedure,
but it is a simple one and, since simplicity has the merit of appealing to the user’s
understanding, it has great advantages.
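
To make requirements (i) and (ii) concrete, the long-run chance of segregating a batch can be computed for any candidate rule. The Python sketch below is an added illustration under loud assumptions: the rule ‘segregate when b - a is at least k’, with k = 4, is purely hypothetical and is not the shaded region of Fig. 1, which is not reproduced here. Only the twelve rounds fired from each batch are taken from the text.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of k failures in n rounds with failure chance p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def segregation_chance(p1, p2, rounds=12, k=4):
    """Chance that the hypothetical rule b - a >= k segregates the new batch.

    a = failures among `rounds` standard shot (failure chance p1),
    b = failures among `rounds` shot of the new batch (failure chance p2).
    """
    return sum(binom_pmf(a, rounds, p1) * binom_pmf(b, rounds, p2)
               for a in range(rounds + 1)
               for b in range(rounds + 1)
               if b - a >= k)


for p2 in (0.2, 0.4, 0.6, 0.8):
    print(p2, round(segregation_chance(0.2, p2), 4))
```

For a fixed standard the chance rises with $p_2$, which is requirement (i); the value at $p_2 = p_1$ is the chance of wrongly segregating a batch of equal quality, which requirement (ii) asks to be small.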

9. When dealing with Example B a very considerable number of factors must be weighed in
the balance, and the result of a statistical test of significance could never be the over-riding one.
There will be other information as to the effect of changes in shell design, possibly from
shell of different calibre; information as to the uniformity in quality of output of the
firm or firms concerned; questions of cost and of general policy. He would be a bold man who
would attempt to express these in numerical terms. Whereas when tackling problem A
it is easy to convince the practical man of the value of a probability construct related to
frequency of occurrence, in problem B the argument that ‘if we were to repeatedly do so
and so, such and such result would follow in the long run’ is at once met by the common-sense
answer that we never should carry out a precisely similar trial again.

10. Nevertheless, it is clear that the scientist with a knowledge of statistical method 
behind him can make his contribution to a round-table discussion, provided he has 
acquired a grasp of the practical issues. Starting from the basis that individual shell will
never be identical in armour-piercing qualities, however good the control of production,
he has to consider how much of the difference between (i) two failures out of twelve and
(ii) five failures out of eight is likely to be due to this inevitable variability. There may be a
number of ways of sizing up the position involving different assumptions or hypothetical
constructs; he may follow one or several of these. The value of his advice is dependent almost





142 Choice of statistical tests

entirely on the soundness of his scientific judgement, and very little on whether his back- 
room calculations have been based on inverse or direct probability or on an appeal to 
fiducial argument. 

11. How far, then, can one go in giving precision to a philosophy of statistical inference?
It seems clear that in certain problems probability theory is of value because of its close 
relation to frequency of occurrence; such seems to be the case for my Example A. Tests can 
be built up to satisfy the practical requirements in this field. In other and, no doubt, more 
numerous cases there is no repetition of the same type of trial or experiment, but all the 
same we can and many of us do use the same test rules to guide our decision, following the 
analysis of an isolated set of numerical data. Why do we do this? What are the springs of
decision? Is it because the formulation of the case in terms of hypothetical repetition helps
to that clarity of view needed for sound judgement? Or is it because we are content that the
application of a rule, now in this investigation, now in that, should result in a long-run
frequency of errors in judgement which we control at a low figure? On this I should not care
to dogmatize, realizing how difficult it is to analyse the reasons governing even one’s own 
personal decisions. 

12. That the frequency concept is not generally accepted in the interpretation of statistical
tests is of course well known. With his characteristic forcefulness R. A. Fisher (1945b)
has recently written: ‘In recent times one often repeated exposition of the tests of signi- 
ficance, by J. Neyman, a writer not closely associated with the development of these tests, 
seems liable to lead mathematical readers astray, through laying down axiomatically, what 
is not agreed or generally true, that the level of significance must be equal to the frequency 
with which the hypothesis is rejected in repeated sampling of any fixed population allowed 
by hypothesis. This intrusive axiom, which is foreign to the reasoning on which the tests of 
significance were in fact based seems to be a real bar to progress.’

13. But the subject of criticism seems to me less an intrusive mathematical axiom than 
a mathematical formulation of a practical requirement which statisticians of many schools 
of thought have deliberately advanced. Prof. Fisher’s contributions to the development of 
tests of significance have been outstanding, but such tests, if under another name, were 
discovered before his day and are being derived far and wide to meet new needs. To claim 
what seems to amount to patent rights over their interpretation can hardly be his serious 
intention. Many of us, as statisticians, fall into the all too easy habit of making authoritative 
statements as to how probability theory should be used as a guide to judgement, but
ultimately it is likely that the method of application which finds greatest favour will be that
which through its simplicity and directness appeals most to the common scientific user’s
understanding. Hitherto the user has been accustomed to accept the function of probability
theory laid down by the mathematicians; but it would be good if he could take a larger share 
in formulating himself what are the practical requirements that the theory should satisfy 
in application. 


(ii) The choice of statistical tests

14. One approach to follow in determining tests to be applied to the 2x2 class of problem 
follows the lines that Neyman and I have adopted since 1928 in dealing with tests of statis- 
tical hypotheses. Let me first recapitulate in broad terms the steps in that approach when 
applied to a problem where the universe of possible observations can be represented by a 



E. S. Pearson


143 


finite set of discrete points. A test of significance may be described as a method of analysis
of statistical data which helps us to discriminate between alternative theories or hypotheses.
In order to make use of the theory of probability in the sense here understood, a random
process must either have been purposely introduced or be assumed to have been present in
the collection of data; then the hypothesis very often concerns the values of parameters
contained in the probability laws which, in the conceptual sphere, form the mathematical
counterpart of the sampling distributions of experience. 

15. We proceed by setting up a specific hypothesis to test, H_0 in Neyman’s and my
terminology, the null hypothesis in R. A. Fisher’s. At the same time, in choosing the test, we
take into account alternatives to H_0 which we believe possible or at any rate consider it
most important to be on the look out for. Thus we wish the test to have maximum dis- 
criminating power within a certain class of hypotheses. Three steps in constructing the test 
may be defined: 

Step 1. We must first specify the set of results which could follow on repeated application
of the random process used in the collection of the data; this may be termed the experi- 
mental probability set. 

Step 2. We then divide this set by a system of ordered boundaries or contours such that 
as we pass across one boundary and proceed to the next, we come to a class of results which 
makes us more and more inclined, on the information available, to reject the hypothesis 
tested in favour of alternatives which differ from it by increasing amounts. 

Step 3. We then, if possible, associate with each contour level the chance that, if H_0 is
true, a result will occur in random sampling lying beyond that level. 

This rather crude statement of procedure will be developed in more detail in discussing 
the problems that arise in connexion with the 2x2 table. 

16. Notes on these points. (a) Step 1. This involves the definition of what Neyman and
I have termed the sample space, W. The application in three forms of the 2 x 2 problem is
discussed in paragraphs 19, 27 and 46 below. 

(b) Step 2. For a given hypothesis under test there may be a number of ways of deriving 
a system of contours, and only in certain cases can there be said to be complete agreement 
on which is the ‘best’. Practical expediency will often carry weight in the choice. It is widely
accepted that the choice cannot be made without paying regard to the admissible hypotheses
alternative to H_0, whether this process is given formal precision or taken as a broad guide.
In our first papers (Neyman & Pearson, 1928a, b) we suggested that the likelihood ratio
criterion, λ, was a very useful one to employ in determining a family of contours which
would be ordered in relation to our confidence in the hypothesis tested when set against
the background of admissible alternatives. Thus Step 2 preceded Step 3. In later papers
(Neyman & Pearson, 1933, 1936 and 1938) we started with a fixed value for the chance, ε,
of Step 3 and determined the associated contour, taking account of what we termed the
power of a test with regard to the alternative hypotheses. The family of Step 2 followed
on giving decreasing values to ε. However, although the mathematical procedure may
put Step 3 before 2, we cannot put this into operation before we have decided, under 
Step 2, on the guiding principle to be used in choosing the contour system. That is why 
I have numbered the steps in this order. 

(c) Step 3. If this can be accomplished, we have what Neyman and I called control of the
‘1st kind of error’. In problems where, as below, we are concerned with discrete rather than



continuous probability distributions (e.g. for the binomial, the Poisson, the multinomial 
and the hypergeometric distributions), this objective cannot always be achieved, and it 
may be necessary to be satisfied with a knowledge of an upper limit of the chance of rejecting 
the hypothesis tested when it is true. 

(iii) Application of this approach to the analysis of data classed in a 2 x 2 table

17. The frequencies of the data in the table may be defined in the following notation:

Table 1

           Col. 1    Col. 2    Total
Row 1        a         c         m
Row 2        b         d         n
Total        r         s         N


If we follow in turn the steps defined above to determine the method of interpretation of 
such data, the requirements of the appropriate tests are seen to follow very simply, although 
mathematical or computational difficulties arise in implementing them. On taking Step 1
we can separate out at once the three types of problem which Barnard has differentiated;*
these I shall call Problems I, II and III. They are distinguished by the sample space having
1, 2 and 3 dimensions respectively. From the mathematical point of view it might seem more
logical to take them in the reverse order, adding first one and then a second restriction to
the 3-dimensioned case of Problem III. For a simple exposition, I think the reverse procedure
of building up from I to III is preferable and this has been adopted in the following sections. 

(iv) Problem I 

18. This may be described as the test of the significance of the difference between two 
treatments after these have been randomly assigned to a group of N = m + n individuals
(Barnard terms it the 2 x 2 independence trial). To use the terminology of a particular
application, we may say that we are observing the presence or absence of ‘reaction X’.
The first treatment is applied to m and the second to n of the N individuals; as a result a/m
and b/n show reaction X.

19. In this case the random process has been applied within the group of N individuals, 
and its repetition would simply involve other random reassignments of the two treatments 
among the N. No assumption is made as to how the N individuals were selected from some 
larger universe. The repetition may be hypothetical, in the sense that it often could not
take place, e.g. if reaction X = death. Indeed, repetition under the same essential conditions
is frequently impossible in practice. But this correspondence between the frequency of 
results upon hypothetical repetition and the probability distribution of the counterpart 
mathematical model forms an accepted part of the process of reasoning whereby (following 

* Statisticians had, of course, all been more or less conscious of these differences, but, at any rate 
in my own case, it was discussion with Mr Barnard which made it easy to see the problem in its full 
clarity. 





the present approach) we use probability theory as a basis for inference. The hypothesis 
tested is that while some individuals show reaction X and some do not, the result would be
the same whichever treatment were applied as far as these N individuals are concerned. Thus, 
on the null hypothesis, there are r = a + b individuals who will react and s = c + d who will 
not, whatever the assignment of treatments. 


20. The chance that a will react in m and b = r − a in n is, therefore, if the hypothesis
be true,

$$P_1\{a \mid N, r, m\} = \frac{m!\, n!\, r!\, s!}{a!\, b!\, c!\, d!\, N!}. \tag{1}$$

This expression is proportional to the coefficient of x^a in the hypergeometric series

$$F(\alpha, \beta, \gamma, x) = F(-r, -m, n - r + 1, x). \tag{2}$$


Thus, taking m ≥ n, a can assume values of

(i) 0, 1, ..., r if r ≤ n,

(ii) r − n, r − n + 1, ..., r if n < r ≤ m,

(iii) r − n, r − n + 1, ..., m if r > m.

For this probability distribution, it is known (K. Pearson (1899) and Kendall (1943, p. 127))
that

$$\text{Mean } a = \frac{mr}{N}, \tag{3}$$

$$\text{Variance of } a = \sigma_a^2 = \frac{mnrs}{N^2 (N - 1)}. \tag{4}$$

21. For the particular case

$$N = 20, \quad r = 7, \quad m = 12, \quad n = 8,$$

the terms in the distribution of P_1{a | 20, 7, 12} are shown as ordinates in Fig. 2 and given in
the accompanying Table 2. The experimental probability set consists of the eight alternative
values for a, viz. 0, 1, ..., 7, with which the probabilities tabled are associated if H_0 is
true. Further

$$\text{Mean } a = \bar{a} = 4.2, \quad \sigma_a = 1.0721. \tag{5}$$

22. Next consider step 2. The purpose of the investigation is to test the hypothesis 
that the difference between a/12 and (r − a)/8 has resulted simply from a random
partition of 20 individuals, of whom r will show reaction X in whichever treatment
group they are included. The experiment gives r = 7. The contour levels fall between
the 8 points of the set as shown in Fig. 2; the further a lies towards the right, the more
inclined we shall be to accept the alternative hypothesis that a/12 > (r − a)/8 because
treatment 1 is more effective than treatment 2. The further a lies to the left, the more
we shall incline towards the reverse alternative. To complete Step 3, we have only to
calculate the sums of the tail terms of the hypergeometric series, as shown in Table 2 for
the special case. 


Table 2. Problem I. Chances for special case N = 20, r = 7, m = 12, if H_0 is true

                          Chance of a or less
  a    Chance of a     True value    Normal approx.
  0      0.0001          0.000          0.000
  1      0.0043          0.004          0.006
  2      0.0477          0.052          0.056
  3      0.1987          0.251          0.257
  4      0.3576            —              —

                          Chance of a or more
  a    Chance of a     True value    Normal approx.
  5      0.2861          0.392          0.390
  6      0.0954          0.106          0.113
  7      0.0102          0.010          0.016
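
The exact terms and tail sums of the hypergeometric distribution of equation (1) can be recomputed directly. The following is a minimal sketch in Python, not part of the original paper; the function names `hypergeom_pmf`, `lower_tail` and `upper_tail` are illustrative choices, and only the standard library is assumed.

```python
from math import comb


def hypergeom_pmf(a, N, r, m):
    """Chance that a of the r reacting individuals fall in the group of m,
    out of N = m + n individuals, under random assignment (equation (1))."""
    n = N - m
    # Outside the admissible range of a the chance is zero.
    if a < max(0, r - n) or a > min(r, m):
        return 0.0
    # comb(m, a) * comb(n, r - a) / comb(N, r) equals m! n! r! s! / (a! b! c! d! N!).
    return comb(m, a) * comb(n, r - a) / comb(N, r)


def lower_tail(a, N, r, m):
    """Chance of a or less."""
    return sum(hypergeom_pmf(k, N, r, m) for k in range(0, a + 1))


def upper_tail(a, N, r, m):
    """Chance of a or more."""
    return sum(hypergeom_pmf(k, N, r, m) for k in range(a, min(r, m) + 1))


# Special case of paragraph 21: N = 20, r = 7, m = 12.
for a in range(8):
    print(a, round(hypergeom_pmf(a, 20, 7, 12), 4))
```

Summing `lower_tail` and `upper_tail` at a chosen pair of critical levels gives the chance of falling beyond either level when there is no treatment difference.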


23. Having set up the machinery of the test, we come to the practical question. Beyond 
which contour levels must a fall before we infer that there is a treatment difference ? Not, 
I think, in the example, if a were 3, 4 or 5; possibly if a = 6, more probably if a = 2 and almost
certainly if a = 0, 1 or 7. Were we to fix as critical levels those between a = 1 and 2 on the
one hand, and between a = 6 and 7 on the other, then we should be guided in our decision
by the following knowledge: if there were no treatment difference, so that seven out of the
twenty individuals would have shown reaction X whichever treatment were applied, then
the chance under random assignment of treatments that a < 2 or > 6 is only 0.014 or 1 in 70.
Had we taken the critical levels between 2 and 3 and between 6 and 7, the corresponding
chance would be 0.062 or 1 in 16. This summing up in terms of probability helps towards
the balanced decision on the next practical step to be taken, because it helps us to assess the 
extent of purely chance fluctuations that are possible. It may be assumed that in a matter 
of importance we should never be content with a single experiment applied to twenty in- 
dividuals; but the result of applying the statistical test with its answer in terms of the chance 
of a mistaken conclusion if a certain rule of inference were followed, will help to determine 




the lines of further experimental work and the degree of confidence with which we proceed 
provisionally to adopt a new technique. 

24. An experiment falling under this head has the advantage that the random process 
introduced is under complete control. The analysis will give an answer in probability terms
whether the N individuals have been randomly selected from a larger whole or not. But this 
answer is limited in the sense that it relates only to the N; if we wish to draw conclusions 
about a wider population or populations, then a random selection of the N or, separately, 
of both its parts m and n is needed. Thus we come to Problems II and III. 

25. Approximation to the hypergeometric terms. When dealing with small numbers, the
calculation of the tail terms of the series may not be laborious, but it soon becomes so when
r is large. An obvious approximation is that obtained by using an integral under the normal
curve with the mean and standard deviation of equations (3) and (4) to represent the sum
of the hypergeometric terms. As usual when approximating to the sum of the terms for
x = a, a + 1, a + 2, ..., etc., of a discrete probability distribution by the integral under a
continuous curve, we take this integral from the point x = a − ½. Thus Fig. 2 shows the
normal curve

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma_a} \exp\left[-\tfrac{1}{2}\,\frac{(x - \bar{a})^2}{\sigma_a^2}\right], \tag{6}$$

with ā and σ_a as in equations (5), and the approximation to the sum of the hypergeometric
terms for a = 6 and 7 is

$$\int_{5.5}^{\infty} p(x)\, dx,$$

represented by the area marked with cross-hatching. The approximations for different 
levels are shown in Table 2, and are seen in this case to be quite adequate for the purpose of 
the test. Further comparisons are made in the Appendix, and it appears that provided m 
and n are fairly nearly equal, as they are likely to be in most planned experiments of the 
Problem I type, the normal approximation is surprisingly good. Yates (1934) has suggested 
a method of further correction. 

26. The correction for continuity. In the 2 x 2 table connexion, the improvement obtained
by taking the normal integral (i) from x = a − ½ if a > ā or (ii) from x = a + ½ if a < ā (so that
we are summing for the lower tail), was pointed out by Yates (1934) and has often been
termed ‘Yates’s correction for continuity’. It is, however, the natural adjustment to make
on the basis of the Euler-Maclaurin theorem, when approximating to a sum of ordinates
by an integral, and without wishing to detract from the value of Yates’s suggestion in this
particular problem, it should be pointed out that the adjustment was used by statisticians
well before 1934,* when employing a normal or skew curve to give the sum of terms of a
binomial or hypergeometric series.
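
The effect of the half-unit adjustment can be illustrated numerically. The sketch below, in Python and not part of the original paper, approximates a hypergeometric tail by a normal integral with the mean and variance of equations (3) and (4); the function names and the `corrected` switch are assumptions introduced for the illustration.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def moments(N, r, m):
    """Mean and standard deviation of a from equations (3) and (4)."""
    n, s = N - m, N - r
    mean = m * r / N
    variance = m * n * r * s / (N ** 2 * (N - 1))
    return mean, sqrt(variance)


def upper_tail_normal(a, N, r, m, corrected=True):
    """Normal approximation to the chance of a or more.
    With the correction the integral starts at a - 1/2 instead of a."""
    mean, sd = moments(N, r, m)
    start = a - 0.5 if corrected else a
    return 1.0 - normal_cdf((start - mean) / sd)


def lower_tail_normal(a, N, r, m, corrected=True):
    """Normal approximation to the chance of a or less.
    With the correction the integral ends at a + 1/2 instead of a."""
    mean, sd = moments(N, r, m)
    end = a + 0.5 if corrected else a
    return normal_cdf((end - mean) / sd)


# Special case N = 20, r = 7, m = 12: chance of a = 6 or more.
print(upper_tail_normal(6, 20, 7, 12, corrected=True))
print(upper_tail_normal(6, 20, 7, 12, corrected=False))
```

In this special case the exact chance of a = 6 or 7 is 8184/77520, and the corrected integral lies much closer to it than the uncorrected one.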


(v) Problem II 

27. This may be described as the test of whether the proportion of individuals bearing a 
character A is the same in two different populations, from each of which a random sample 
has been drawn, i.e. the test of the hypothesis that 

$$p_1(A) = p_2(A) = p, \tag{7}$$

* The method was in use in the Department of Applied Statistics when I joined the staff in 1921, 
and may have been current many years before that. 


where p is some common but unspecified proportion. Barnard describes this as the case of 
the 2 x 2 comparative trial. Here m individuals have been drawn at random from the first
population and n from the second, and it is found that a/m and b/n, respectively, bear the
character A. The conditions are assumed to be such that if the random procedure of selection
were repeated, the appropriate probability distributions for a and b would be given by the 
terms of binomial expansions. Table 3 shows the observed results. 


Table 3

                No. with         No.
                character A      without A     Total
1st sample          a                c            m
2nd sample          b                d            n
Total               r                s            N





Fig. 3. The curves ABC and A'B'C' represent the significance contours L_ε and L'_ε, respectively.


In this problem there have been two applications of a random selection process, not one
as for Problem I, and the experimental probability set consists of the (m + 1)(n + 1) alternative
values of the doublet (a, b) (0 ≤ a ≤ m, 0 ≤ b ≤ n) which can be represented in the lattice
diagram shown in Fig. 3 for the special case m = 12, n = 8. It might, of course, be argued
that in the hypothetical repetition of the selection process m and n need not remain constant,
but this, I think, would introduce an unnecessary complication into the probability set-up.





28. The question before us is whether the result (a, b) is consistent with the hypothesis
H_0 defined in equation (7) above, or whether it suggests that either p_1 > p_2 or p_1 < p_2.
A little reflexion shows that we have no reason to reject H_0 if the point (a, b) lies near the
diagonal line on which a/m = b/n, but, broadly speaking, are more and more likely to do so
the farther the point falls from this line in the direction of the corners (0, n) and (m, 0) of the
lattice diagram. This statement requires amplification. In defining the significance contours
we may consider the following question: If H_0 is not true, what departures from equality
in p_1 and p_2 do we regard it of equal importance to detect? Should the power of the test be
roughly the same for constant values, for example, of

$$\text{(a) } p_1 - p_2, \qquad \text{(b) } p_1 / p_2 \qquad \text{or} \qquad \text{(c) } \frac{p_1}{1 - p_1} \Big/ \frac{p_2}{1 - p_2}\,?$$

The procedure which I have adopted in the sections which follow is frankly one of ex- 
pediency. I have not considered in detail how to choose a family of significance contours 
satisfying requirements formulated in advance, but have taken those suggested by the
customary large-sample procedure which gives contours of the form ABC, A'B'C' drawn
in Fig. 3. These will, I believe, make the power of the test to detect a difference more nearly 
dependent on the ratio of the odds given by (c) than on either of the expressions (a) or (b). 
E. B. Wilson (1941) chooses the expression (a). This point, however, needs further investiga- 
tion. It should be noted that a similar problem, in the case where the sampling distribu- 
tions follow the Poisson law, was discussed very fully by Przyborowski & Wilenski (1939). 

29. Besides involving a 2-dimensional instead of a 1-dimensional experimental probability
set, Problem II differs from Problem I in that we need an answer which is independent
of the unknown common probability p of the null hypothesis. In Problem I the part
of p was played by the fraction r/N given by the data. We are concerned now with what
Neyman and I (Neyman & Pearson, 1933) have termed a composite hypothesis, and were
it possible would like the contour levels to bound regions which are ‘similar to the sample
space with regard to the parameter p’ (loc. cit. p. 313) (i.e. are independent of p). The
following considerations show the lines along which a first attack of the problem can proceed.

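30. If H_0 is true and equation (7) holds, then the probability of the observed result may
be written*

$$P_2\{a \mid p, m\} \times P_2\{b \mid p, n\} = \frac{m!}{a!\, c!}\, p^a (1 - p)^c \times \frac{n!}{b!\, d!}\, p^b (1 - p)^d \tag{8.1}$$

$$= \frac{N!}{r!\, s!}\, p^r (1 - p)^s \times \frac{m!\, n!\, r!\, s!}{a!\, b!\, c!\, d!\, N!} \tag{8.2}$$

$$= P_2\{r \mid p, N\} \times P_1\{a \mid N, r, m\}. \tag{8.3}$$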

Thus the probability of obtaining the doublet (a, b) in sampling from two populations with
a common p may be regarded as the product of two terms:

(i) The probability that a + b = r or that the point (a, b) in Fig. 3 falls on a diagonal line
on which r = constant. This probability, P_2{r | p, N}, is the (r + 1)th term in the expansion
of the binomial

$$((1 - p) + p)^N.$$

(ii) The relative probability, given r, of the observed partition into a and b = r − a; this
is independent of p and is identical with the expression P_1{a | N, r, m} of equation (1), i.e. is
proportional to a term of the hypergeometric series (2).
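
The factorization of the joint binomial probability into a binomial term in r and a hypergeometric term in a can be checked numerically. The following Python sketch is not part of the original paper; the function names are hypothetical and only the standard library is assumed.

```python
from math import comb


def binom_pmf(k, p, size):
    """Binomial chance of k successes in `size` trials with probability p."""
    return comb(size, k) * p ** k * (1 - p) ** (size - k)


def hypergeom_pmf(a, N, r, m):
    """Hypergeometric chance of a, given the margins N, r, m (equation (1))."""
    n = N - m
    if a < max(0, r - n) or a > min(r, m):
        return 0.0
    return comb(m, a) * comb(n, r - a) / comb(N, r)


def joint_direct(a, b, p, m, n):
    """Product of the two binomial chances for the doublet (a, b)."""
    return binom_pmf(a, p, m) * binom_pmf(b, p, n)


def joint_factored(a, b, p, m, n):
    """Binomial chance of r = a + b times the hypergeometric chance of a given r."""
    N, r = m + n, a + b
    return binom_pmf(r, p, N) * hypergeom_pmf(a, N, r, m)


# Largest discrepancy over the whole lattice for m = 12, n = 8 and one value of p.
worst = max(
    abs(joint_direct(a, b, 0.35, 12, 8) - joint_factored(a, b, 0.35, 12, 8))
    for a in range(13)
    for b in range(9)
)
print(worst)
```

Only the first factor involves p, which is why conditioning on r removes the unknown common proportion from the second factor.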

* It will be seen that P_1{ } has been used to denote a hypergeometric probability and P_2{ } a
binomial probability.



31. If, now, it were possible to draw a boundary line such as ABC shown in Fig. 3,
cutting off at the end of each diagonal, r = constant, a group of points (a, r − a) such that

$$\sum_{a} \left[ P_1\{a \mid N, r, m\} \right] = \varepsilon, \tag{9}$$

where ε is a fraction between 0 and 1 chosen at will, then the requirement of Step 3 would
be satisfied. For in rejecting H_0 when (a, b) fall beyond this boundary,* the chance of doing
so if H_0 were true would be

$$\sum_{r=0}^{N} \left[ P_2\{r \mid p, N\} \times \varepsilon \right] = \varepsilon \times \sum_{r=0}^{N} \left[ P_2\{r \mid p, N\} \right] = \varepsilon, \tag{10}$$

i.e. would be independent of the unknown common p of the hypothesis tested. The test
would then be analogous to ‘Student’s’ test for the significance of the difference between
two means, where we have a system of contour levels each associated with a chance ε,
independent of the values of any unknown parameters which are irrelevant to the composite
hypothesis tested.

32. Unfortunately, this objective cannot be achieved because we are not dealing with
continuous probability distributions and P_1{a | N, r, m} exists only at discrete, integral
values of a. If we follow the present line of approach, all that is possible is to take contour
or significance levels which cut off from an end of each diagonal, r = constant, a group of
points for which

$$\sum_{a} \left[ P_1\{a \mid N, r, m\} \right] = \beta_r \le \varepsilon. \tag{11}$$

Then, in rejecting H_0 when (a, b) falls beyond such a contour, we know that the chance of
doing so, if H_0 is true, will be

$$\sum_{r=0}^{N} \left[ P_2\{r \mid p, N\} \times \beta_r \right] \le \varepsilon. \tag{12}$$

It is clear that the amount by which the probability falls below ε will be a function of p,
and that in taking Step 3 we are only associating with each significance level an upper
limit, ε, to the probability of rejecting H_0 when it is true.
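
How far the chance of rejection falls below the nominal limit, and how it varies with p, can be seen in a small numerical sketch. The Python code below is not part of the original paper. It assumes one particular rule for the level along each diagonal, namely cutting off from the lower end of a as many points as possible without the tail sum exceeding the limit; this rule and all function names are illustrative assumptions.

```python
from math import comb


def binom_pmf(k, p, size):
    """Binomial chance of k successes in `size` trials with probability p."""
    return comb(size, k) * p ** k * (1 - p) ** (size - k)


def hypergeom_pmf(a, N, r, m):
    """Hypergeometric chance of a given the margins N, r, m (equation (1))."""
    n = N - m
    if a < max(0, r - n) or a > min(r, m):
        return 0.0
    return comb(m, a) * comb(n, r - a) / comb(N, r)


def lower_tail_sum(N, r, m, eps):
    """Largest lower-tail sum along the diagonal a + b = r not exceeding eps
    (assumed rule: add points from the lower end of a until the next would overshoot)."""
    n = N - m
    beta = 0.0
    for a in range(max(0, r - n), min(r, m) + 1):
        term = hypergeom_pmf(a, N, r, m)
        if beta + term > eps:
            break
        beta += term
    return beta


def rejection_chance(p, m, n, eps):
    """Chance of falling beyond the level when both samples share the proportion p:
    the binomial chance of each r weighted by the tail sum on that diagonal."""
    N = m + n
    return sum(
        binom_pmf(r, p, N) * lower_tail_sum(N, r, m, eps) for r in range(N + 1)
    )


# Chance of rejection for m = 12, n = 8 and a nominal limit of 0.05.
for p in (0.05, 0.2, 0.5, 0.8):
    print(p, rejection_chance(p, 12, 8, 0.05))
```

Under this rule every value printed stays at or below 0.05, but the shortfall differs from one p to another.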

33. We have still, of course, to determine the most appropriate system of significance
levels and to set out a ready means of finding an upper limit, ε, associated with the level on
which an observed doublet (a, b) falls.† Mr Barnard has broken new ground in

(i) defining for this Problem II one systematic method of determining a family of levels
L_ε based on certain clearly defined principles;

(ii) determining the true upper bound to the associated probability ε which, in the case
of small samples at any rate, may be considerably below that which has hitherto been used.

Since, however, much tabling is needed before his theoretical advance can be followed
by a practical working rule available for samples of any sizes, m and n, I think it is worth
while describing the cruder handling of the lattice diagram which I had discussed in 1938-9


* There would be a similar series of boundaries, L'_ε, below the diagonal a/m = b/n, such as A'B'C'
of Fig. 3.

† The likelihood ratio λ might be used in determining the family of significance contours, as was
suggested in connexion with the general χ² problem (Neyman & Pearson, 1928b, p. 283). In large
samples λ would approximately equal e^{−½u²}, where u is given by equation (22) below.




lectures. This involves, perhaps, not much more than a restatement of what may be termed 
the classical approach to Problem II (see paras. 43 and 44 below), but it does bring out the 
difference between Problems I and II, which I think important. 

34. It may be well to emphasize here that this distinction between the handling of
Problems I and II is not universally accepted. Fisher has set out his approach as follows in
a paper read before the Royal Statistical Society (1935): ‘To the many methods of treatment
hitherto suggested for the 2x2 table the concept of ancillary information suggests this new 
one. Let us blot out the contents of the table, leaving only the marginal frequencies. If it 
be admitted that these marginal frequencies by themselves supply no information on the 
point at issue, namely, as to the proportionality of the frequencies in the body of the table, 
we may recognize the information they supply as wholly ancillary; and therefore recognize 
that we are concerned only with the relative probabilities of occurrence of the different ways 
in which the table can be filled in, subject to these marginal frequencies.’ 

This view has also been supported by Yates (1934). As I understand it, Fisher would refer
the observation (a, b) to a linear set (as in my Problem I), however the data have been
collected; this attitude follows readily if we discard the requirement that the probability 
distribution used in the test must be related to the frequency distribution that would be 
generated by repeated application of the random sampling process employed in the experi- 
ment. It will be seen that with Fisher’s approach there is a gain in simplicity in handling 
the analysis; it must remain a matter of opinion whether there is a loss in the relevance
of the probability construct to the question at issue. It is, of course, only when handling
small samples or in cases where (a, b) lies close to one of the corners (0, 0) or (m, n) of the
lattice that this need for choice between probability constructs is thrust upon us. 


(vi) Solution of Problem II, using the normal approximation 

35. If the samples are large, the calculation of hypergeometric terms becomes laborious
and we turn naturally, as in so many other statistical problems, to the approximation using
the normal curve. In fact, except when r or s are very small or m and n very different in
magnitude, the normal curve with mean and standard deviation given by equations (3)
and (4) provides a surprisingly good approximation to the relative probability distribution
of a for fixed r, viz. P_1{a | N, r, m} (see Appendix). Define u_ε as the deviate of the standardized
normal curve for which

$$\varepsilon = \frac{1}{\sqrt{2\pi}} \int_{u_\varepsilon}^{\infty} e^{-\frac{1}{2}u^2}\, du. \tag{13}$$

Then we can draw across the lattice diagram a significance level L_ε above and another L'_ε
below* the diagonal a/m = b/n such that

(i) all points (a, b) for which

$$\frac{a + \tfrac{1}{2} - \bar{a}}{\sigma_a} \le -u_\varepsilon \tag{14}$$

lie beyond, i.e. above, L_ε;

(ii) and all points (a, b) for which

$$\frac{a - \tfrac{1}{2} - \bar{a}}{\sigma_a} \ge u_\varepsilon \tag{15}$$

lie beyond, i.e. below, L'_ε.


* The words ‘above’ and ‘below’ are used in the sense of Figs. 3 and 4. 



If we wish to take special action either when a/m is significantly less than b/n or significantly
greater, then we shall use both levels L_ε and L'_ε; if only, however, when a/m < b/n, then we use
L_ε. The corresponding probability levels would be obtained by making ε for the second case
twice its value for the first. Fig. 4 shows the 247 relative probabilities P_1{a | N, r, m} for the
case m = 18, n = 12. The unbroken, stepped lines are two contour levels determined in this
way. Purely for convenience in drawing, the level with ε = 0.05 and u_0.05 = 1.6449 has been
put above the diagonal and that with ε = 0.01 and u_0.01 = 2.3263 below.

36. If the normal approximation to the hypergeometric series were correct, it would
follow that along every diagonal, r = constant, the sum of the relative probabilities for
points above L_ε would satisfy the inequality (11). Hence the inequality (12) for the complete
area of the lattice above L_ε would hold, whatever the value of the common p. A similar result
would hold for the area below L'_ε. Of course, the normal approximation will not hold precisely,
particularly when r or s are small, but here we shall generally be on the safe side, in
the sense that the hypergeometric distribution is flat-topped with abrupt ends so that the
β_r of equation (11) will be considerably less than ε, and often zero.

37. It is interesting to examine the results set out in Fig. 4 with the help of the detailed
calculations given in Table 4. Columns (2) and (3) give, for constant r, the mean and standard
deviation of P_1{a | 30, r, 18}, while columns (4) (for u_0.05) and (8) (for u_0.01) give the cut-off
points defined by the normal approximation, i.e.

$$a_1 = \bar{a} - \tfrac{1}{2} - u_{0.05} \times \sigma_a, \quad \text{and} \quad a_2 = \bar{a} + \tfrac{1}{2} + u_{0.01} \times \sigma_a. \tag{16}$$

The sums of the relative probabilities P_1{a | 30, r, 18} for a ≤ a_1 and a ≥ a_2 are given in
cols. (6) and (9) respectively. Thus, for example, for r = 7

$$a_1 = 4.2 - 0.5 - 1.6449 \times 1.1543 = 1.80,$$

and the sum of the probabilities for a = 0 and 1 is

$$0.0004 + 0.0082 = 0.0086.$$

These are the tail sums, termed β_r in equation (11). It is clear from an examination of
cols. (6) and (9) that they are all less, and many of them very much less than 0.05 and 0.01.
This is inevitable with a discrete distribution containing few terms. The contour levels have
been drawn conventionally in Fig. 4 as steps passing through the half-integer points and
not through the cut-off points of cols. (4) and (8). Clearly, whichever way they are drawn,
they will separate off the same subset of the (m + 1)(n + 1) points in the lattice diagram.
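
The worked case r = 7 can be retraced step by step. The Python sketch below is not part of the original paper: it computes the lower cut-off point from the normal approximation, with the half-unit adjustment, and then sums the exact hypergeometric terms at or below it. The function names are hypothetical, and the deviate 1.6449 is the value quoted in the text for the 0.05 level.

```python
from math import comb, sqrt

U_05 = 1.6449  # normal deviate quoted in the text for a one-tail chance of 0.05


def hypergeom_pmf(a, N, r, m):
    """Hypergeometric chance of a given the margins N, r, m (equation (1))."""
    n = N - m
    if a < max(0, r - n) or a > min(r, m):
        return 0.0
    return comb(m, a) * comb(n, r - a) / comb(N, r)


def lower_cutoff(N, r, m, u):
    """Lower cut-off point: mean minus one half minus u standard deviations,
    using the mean and variance of equations (3) and (4)."""
    n, s = N - m, N - r
    mean = m * r / N
    sd = sqrt(m * n * r * s / (N ** 2 * (N - 1)))
    return mean - 0.5 - u * sd


def tail_sum_below(N, r, m, cutoff):
    """Sum of the exact hypergeometric terms for all a at or below the cut-off."""
    n = N - m
    return sum(
        hypergeom_pmf(a, N, r, m)
        for a in range(max(0, r - n), min(r, m) + 1)
        if a <= cutoff
    )


# Case of paragraph 37: m = 18, n = 12 (N = 30), diagonal r = 7.
a1 = lower_cutoff(30, 7, 18, U_05)
print(round(a1, 2), round(tail_sum_below(30, 7, 18, a1), 4))
```

Repeating the calculation for each r from 0 to N gives the tail sum on every diagonal of the lattice.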

38. The next question is this. If we were to use either of these levels, what in fact would
be the chance of the sample doublet (a, b) falling beyond, if the null hypothesis were true?
This will depend on the common value of p. The product sums

$$\sum_{r=0}^{N} \left[ P_2\{r \mid p, N\} \times \beta_r \right] = \sum_{r=0}^{N} \left[ \frac{N!}{r!\, s!}\, p^r (1 - p)^s \times \beta_r \right] \tag{17}$$

obtained by multiplying the expressions in cols. (5) and (9) of Table 4 by the appropriate
binomial terms are shown for a variety of values of p in Table 6, cols. (2) and (3). It is clear
at once how far on the safe side we are in saying that these chances are < 0.05 and 0.01
respectively. Similar calculations were carried out for a second example, taking m = n = 10,



E. S. Peabsok 


153 



Fig. 4. Hypergeometrio probabilities in lattice diagram for m= 18, n— 12. 


I 

a 




d 

o 

.£3 

•4= 


0 

1 


3 i 




Hn Hw 




> 


a> 

a 


-tS 

cd 

f 


ns 

0^ 

XJ 


9 S 





^ rH>— li-Hi-SrHf-HrHf— <>— li— 


C 0 » 0 '^C^<N<O 10 rH 00 Vf 5 Ot>Or-(rH^ 0 C 0 CD«Or-^ 

»Ot'COOO'^ 05 C 3 SCCTHCOO^OO>i< 05 C<IOOlO»OX 

i-« 00 (N'-^ 0 '-tOOi^Or-lO<MOOOCSlO'-i 

OOOOOOOOQOOOOOOOCpOOO 

ooooooooooooooooooooooooooooooo 


ry»ococc>oi>xu 5 ooqoxco> 05 occ>«o 
cc<- 40 '^c^o«<--te^c^weio<Nx»~<ic 
r ooooooooooooooooo 

I ooooooooooooooooo 

ooooooooooooooooooooooooooooooo 


^ r~(^XX»OXCOrHri;«<IXr-Ht-«OC^OXXOOC^VirHO'-^ 

C ioiOF-tco(Mco<MOCM»fto^>«v:»FHco«ocqTtii>f-HXco^ 

w »-» o o O O o O O O o 9 O O O o o o o o o o o o o ^ 
0066666666066666666606066600000 


f-<Xiflt'C 0 OOrH'rii®^ 6 xOOOXO'«*<X‘^'HOO«t>i 0 «OO 
^ aqO-^CpWOpCOOOXOp'^OlO^t^XOO'NXlfiNOOrHOCOOift© 

£' 06666 pHrH<^i^c 5 t 6 M< 6666 i>i>x 666 ^A<N 6'^'^6 66 

I I— 


<— lOOOt'i-Hi— iif 5 O<N 05 i-Hi 0 C 0 X 
^ «w^\OX€ 00 "«J<XOi-HTl<OOOXOOOC'V«rH 

S '-« 0 «N 0 <N 0 C^ 0 r-t-^i-<C^ 0 rHW»-(eqTj< 0 »-tC^ 

i' ooooooooooooooooooooo 

0000666666666666666666666000000 


0 <-^<Ni 0 l>C 0 OO'-H^X'^OxO 5 DC 0 ®O'«i<X'«! 4 <'^OOff 0 I>» 0 XOO 

|• 9 l^'^ 09 cococlo«op 999 oo(^^cp'^|--|^-wOl>T^^r^o>lA^^^^o 9 

«'666666rH»Mw66w'^6666i>xx666Ac^G^ffJs^66t^ 

llll 


00000 '^f^**^i^rHrHf— li—lpH 


pH f— < iH rH O O O ^ 


^ OGqOp'^OOWCO'^jOONX'^OCONXTHO CD*<N XrHOOff^Qp-^O 
2 -<^o»>HA« 66 '^'^ 666 i>r^x 6666 A(NM 664 < 6666 i ^6 

rHi-Hi-Hi-HrHi-HpHrHrHPHrHrHrHi-H 


^OrH<NC«5Ti»»OCOI>XC550'-HC<JM-^W5<Ot>XOOpHWM'^iOCCilr*Xa50 










E. S. Pbabson 


155 


and the results are shown in Table 6, cols-. (6) and (7). In this case, the actual chances of 
[a, b) falling on or beyond the significance levels are even further below the nominal limits 
of 0‘05 and O'Ol. In fact, it becomes clear that in the case of small samples, at any rate, this 
method of introducing the normal approximation gives such an overestimate of the true 
chances of falling beyond a contour as to be almost valueless. 
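The product sum described above can be sketched as follows, assuming that the 0.05 level of Method 1 is the lower-tail region $a \le \bar{a} - \tfrac{1}{2} - 1.6449\,\sigma_a$ on each diagonal $r$ = constant. The helper names `p1`, `binom` and `method1_chance` are hypothetical, and this is an illustration rather than the computation actually used for the tables.

```python
from math import comb, sqrt

def p1(a, N, r, m):
    """Hypergeometric term P_1{a | N, r, m}."""
    return comb(m, a) * comb(N - m, r - a) / comb(N, r)

def binom(r, N, p):
    """Binomial chance of r individuals with character A among N."""
    return comb(N, r) * p ** r * (1 - p) ** (N - r)

def method1_chance(p, m, n, x=1.6449):
    """Chance, for a common p, that (a, b) falls in the lower-tail region of Method 1."""
    N = m + n
    total = 0.0
    for r in range(N + 1):
        s = N - r
        a_bar = r * m / N
        sd = sqrt(m * n * r * s / (N ** 2 * (N - 1)))
        cut = a_bar - 0.5 - x * sd               # cut-off with continuity correction
        lo, hi = max(0, r - n), min(m, r)        # admissible values of a for this r
        tail = sum(p1(a, N, r, m) for a in range(lo, hi + 1) if a <= cut)
        total += binom(r, N, p) * tail           # weight the tail sum by the binomial term
    return total

chances = {p: method1_chance(p, 18, 12) for p in (0.05, 0.1, 0.2, 0.3, 0.4, 0.5)}
for p, c in chances.items():
    print(p, round(c, 4))
```

Under these assumptions every value stays well below the nominal 0.05, which is the point made in the text.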

Table 5. Showing the difference between nominal and actual significance levels

Columns: (1) $p$ (if $H_0$ true). 1st example, $m = 18$, $n = 12$: true chance of falling on or
beyond (2) $L_{0.05}$ and (3) $L_{0.01}$ under Method 1; (4) $L_{0.05}$ and (5) $L_{0.01}$ under Method 2.
2nd example, $m = 10 = n$: (6) $L_{0.05}$ and (7) $L_{0.01}$ under Method 1; (8) $L_{0.05}$ and (9) $L_{0.01}$
under Method 2. (10) $p$ (if $H_0$ true).

O’OS 

O'OOlO 

0-0000 


0-0000 






msm 

0'0064 

0-0000 


0-0003 



0-0261 

0-0006 

0-1 

0-2 

0-0141 

0-0003 


0-0043 

0-0037 




0-2 



0-0012 

0-0490 

0-0091 


0-0014 

0-0496 


0-3 

0-4 


0-0023 



0-0062 




0-4 








0-0672 






0-0437 

0-0119. 


Repeat as for 1 — » 






0-0431 

0-0120 





0-7 















0-0062 





0-9 

0-96 

99 


Hill 

0-0010 







39. Before considering a second method, it will be useful to recapitulate certain characteristics
of what I have termed Method 1. It provides for any nominal value of $\epsilon$ one systematic
procedure of defining a critical boundary, or significance level, cutting off a region from the
lattice diagram. Neither the subgroup of points cut off, nor the sum of the probabilities
associated with them for a given $p$, will alter continuously with $\epsilon$; they will change by discrete
steps as the cut-off point, defined in para. 37, passes through a point $(a, b)$. While we shall
sometimes want to know whether the observed $(a, b)$ falls beyond a level specified in
advance, more often we shall ask what is the level on which $(a, b)$ falls. This, using Method 1,
we find by calculating

$$u = \frac{a + \tfrac{1}{2} - \bar{a}}{\sigma_a} \ \text{ if } a < \bar{a}, \quad\text{or}\quad u = \frac{a - \tfrac{1}{2} - \bar{a}}{\sigma_a} \ \text{ if } a > \bar{a}, \tag{18}$$

and finding $\epsilon$ from the normal integral of equation (13). In this way the nominal chance $\epsilon$
will be a little nearer the true upper limit than the figures in Table 5 suggest,* but not enough
to modify the criticism expressed above.

* It will be seen from Table 4 that no point $(a, b)$ gives a value in cols. (5) and (9) of exactly 0.05 or 0.01,
respectively, so that no points actually lie on $L_{0.05}$ or $L_{0.01}$.
























Choice of statistical tests 

40. Method 2. The introduction of the correction of $\tfrac{1}{2}$ for continuity is certainly
appropriate in using the normal approximation to the hypergeometric series in Problem I,
but I think it is not helpful in Problem II where we are concerned with a 2-dimensional
experimental probability set. If instead of obtaining significance levels $L_\epsilon$ and $L'_\epsilon$ as in
paras. 35-37, we obtain them from inequalities similar to (14) and (15) but with the
correction of $\tfrac{1}{2}$ omitted, then there are several points to be noted:

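(a) For the significance level $L_\epsilon$, the expression

$$\beta_r = \sum_a \big[ P_1\{a \mid N, r, m\} \big], \tag{19}$$

where the summation is for values of $a$ on the diagonal, $r$ = constant, for which

$$a \le \bar{a} - x_\epsilon\,\sigma_a, \tag{20}$$

will be sometimes less and sometimes greater than $\epsilon$. Hence, in the balance, it seems likely
that the chance of the point $(a, b)$ lying beyond $L_\epsilon$, or

$$\sum_r \big[ P_2\{r \mid p, N\} \times \beta_r \big], \tag{21}$$

will lie closer to $\epsilon$ than when the $\tfrac{1}{2}$ correction is used. The position will be the same for $L'_\epsilon$.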

(b) In drawing repeated samples of $m$ and $n$ from two populations in which there is a
common chance, $p$, of an individual possessing character $A$, the ratio

$$u = \frac{a - \bar{a}}{\sigma_a} = \frac{a - rm/N}{\sqrt{\dfrac{mnrs}{N^2(N-1)}}} \tag{22}$$

has, whatever be $p$, (i) an expectation of zero, (ii) a unit standard deviation.* The shape of
the distribution will, of course, depend on $p$, but, faute de mieux, we may not in the long run
do too badly by assuming it to be normal. It is, of course, the weighted combination of a
number of hypergeometric series whose shape depends on $r$.
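The two stated properties of $u$ can be verified by exact enumeration over the lattice. This is a sketch under the assumption, made in the author's footnote, that the cases $r = 0$ and $s = 0$ are excluded; the function name `u_moments` is hypothetical.

```python
from math import comb, sqrt

def u_moments(p, m, n):
    """Exact mean and standard deviation of u over the lattice, excluding r = 0 and s = 0."""
    N = m + n
    weight = mean = second = 0.0
    for r in range(1, N):                          # r = 0 and r = N make u indeterminate
        s = N - r
        b_r = comb(N, r) * p ** r * (1 - p) ** s   # binomial chance of this diagonal
        a_bar = r * m / N
        sd = sqrt(m * n * r * s / (N ** 2 * (N - 1)))
        for a in range(max(0, r - n), min(m, r) + 1):
            pr = b_r * comb(m, a) * comb(n, r - a) / comb(N, r)
            u = (a - a_bar) / sd
            weight += pr
            mean += pr * u
            second += pr * u * u
    mean /= weight                                 # renormalise after the exclusion
    second /= weight
    return mean, sqrt(second - mean ** 2)

for p in (0.1, 0.3, 0.5):
    print(p, u_moments(p, 18, 12))
```

The mean and standard deviation come out as 0 and 1 (to rounding error) for every $p$, because they already hold on each diagonal $r$ = constant separately.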


41. Consider the result of applying this Method 2 to the case $m = 18$, $n = 12$ already
discussed. The procedure for determining the 0.05 and 0.01 significance levels will be exactly
as under Method 1, except that the continuity correction of $\tfrac{1}{2}$ is omitted. The resulting levels
are shown as dashed, stepped lines in Fig. 4.† They fall, on the whole, inside the significance
levels obtained by Method 1. Now turn to Table 4, where cols. (6) and (10) show the cut-off
points a half unit further in towards the diagonal $a/m = b/n$. Cols. (7) and (11) give the values
of $\beta_r$; some of these are considerably above the nominal values of $\epsilon = 0.05$ and 0.01, others
are still well below. But from the approach to Problem II that has been adopted, this is
immaterial since the experimental probability set is the 2-dimensioned one of the lattice
diagram and is not restricted to the diagonal $r$ = constant on which the observed point
$(a, b)$ may happen to lie. What we are concerned with is the summed chance given by
expression (21) and the value of this is given for eleven values of $p$ in cols. (4) and (5) of Table 5. It
will be seen that this true chance does sometimes exceed the nominal values of 0.05 and 0.01,
but never by very much. Again, for the second example with $m = 10 = n$ (Table 5, cols.
(8) and (9)) the true chance, while it sometimes exceeds the nominal value, is always
considerably nearer it than using the significance levels of Method 1.

* Provided cases where $r$ or $s$ are zero, making the expression (22) indeterminate with $u = 0/0$,
are excluded. Mr Barnard has pointed out that one way of avoiding this exclusion would be to lay
down that, when $u = 0/0$, we assign to the ratio a value chosen at random from a population (say
normal) with zero mean and unit variance.

† Again, for convenience the 5% level is drawn above and the 1% level below the diagonal.
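The effect of dropping the continuity correction can be illustrated by computing the summed chance with and without it. The sketch below assumes a lower-tail region $a \le \bar{a} - \text{correction} - x\,\sigma_a$ on each diagonal and uses the hypothetical name `chance_beyond`; it is not claimed to reproduce the tabulated figures exactly.

```python
from math import comb, sqrt

def chance_beyond(p, m, n, x, correction):
    """Summed chance that (a, b) falls in the lower-tail region, for a common p."""
    N = m + n
    total = 0.0
    for r in range(N + 1):
        s = N - r
        a_bar = r * m / N
        sd = sqrt(m * n * r * s / (N ** 2 * (N - 1)))
        cut = a_bar - correction - x * sd
        beta_r = sum(comb(m, a) * comb(n, r - a) / comb(N, r)
                     for a in range(max(0, r - n), min(m, r) + 1) if a <= cut)
        total += comb(N, r) * p ** r * (1 - p) ** s * beta_r
    return total

for p in (0.1, 0.2, 0.3, 0.4, 0.5):
    m1 = chance_beyond(p, 18, 12, 1.6449, 0.5)   # Method 1: correction of 1/2
    m2 = chance_beyond(p, 18, 12, 1.6449, 0.0)   # Method 2: correction omitted
    print(p, round(m1, 4), round(m2, 4))
```

Since the Method 2 cut-off lies half a unit nearer the diagonal, its region contains the Method 1 region, so its chance can never be the smaller of the two.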

42. It is clear that no final conclusions can be based on two numerical examples, but it
seems that the test of the null hypothesis in Problem II should be carried out as follows:

(a) When $m$, $n$, $r$ or $s$ are small, with the help of tables prepared on Barnard's lines, based
on an ordered classification of the points in the lattice diagram, and giving the true upper
bound of the chance that a point $(a, b)$ falls on or beyond the level on which the observed
result lies. The particular basis of his classification may, of course, be modified.

(b) When $m$, $n$, $r$ and $s$ are large, by assuming that the $u$ of equation (22) is a normal
deviate with unit standard deviation.


(vii) The classical approach to Problem II 
43. It has recently become customary to regard the test of significance applied to data
given in a 2 × 2 table as the limiting case of a $\chi^2$ test with one degree of freedom. But Problem
II was originally answered in somewhat different terms. It was noted that if

$$p_1(A) = p_2(A) = p, \tag{23}$$

then the fractions $a/m$ and $b/n$ would both have expectations of $p$ and variances of $p(1-p)/m$
and $p(1-p)/n$, respectively. Hence, if the null hypothesis were true, the difference

$$d = \frac{a}{m} - \frac{b}{n} \tag{24}$$

would have

$$\text{mean } d = 0, \qquad \sigma_d = \sqrt{p(1-p)\left(\frac{1}{m} + \frac{1}{n}\right)}. \tag{25}$$

In large samples, therefore, it might be expected that

$$\frac{d}{\sigma_d} = \frac{a/m - b/n}{\sqrt{p(1-p)(1/m + 1/n)}} \tag{26}$$

would be approximately normally distributed. Since by the nature of the problem the
common value of $p$ was unknown, an estimate was made from the sample, namely,

$$\hat{p} = \frac{a + b}{m + n} = \frac{r}{N}. \tag{27}$$

Substituting this into equation (26), we have

$$\frac{d}{s_d} = \frac{a/m - b/n}{\sqrt{(r/N)(1 - r/N)(1/m + 1/n)}} \tag{28.1}$$

$$\phantom{\frac{d}{s_d}} = \frac{a - rm/N}{\sqrt{mnrs/N^3}}. \tag{28.2}$$

44. The form (28.2) is easily derived from (28.1), if we remember that $b = r - a$, $s = N - r$
and $m + n = N$.* It is seen that the ratio $d/s_d$ is identical with the ratio $u$ of equation (22),
except for a factor $\sqrt{(N-1)/N}$ which is unimportant in large samples. Thus the classical
test is practically identical with that suggested in paras. 40-42 above, though the two tests
are differently derived.

* A third alternative form is, of course, $(ad - bc)\sqrt{N}/\sqrt{mnrs}$.
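The algebraic identity of the three classical forms, and their relation to $u$, can be confirmed numerically. The sketch below uses the frequencies of Table 7 ($a = 15$, $b = 5$, $c = 3$, $d = 7$) and a hypothetical function name `classical_forms`.

```python
from math import sqrt, isclose

def classical_forms(a, b, c, d):
    """Return forms (28.1), (28.2), the footnote form, and u of equation (22)."""
    m, n, r, s = a + c, b + d, a + b, c + d
    N = m + n
    p_hat = r / N
    f1 = (a / m - b / n) / sqrt(p_hat * (1 - p_hat) * (1 / m + 1 / n))   # (28.1)
    f2 = (a - r * m / N) / sqrt(m * n * r * s / N ** 3)                   # (28.2)
    f3 = (a * d - b * c) * sqrt(N) / sqrt(m * n * r * s)                  # footnote form
    u = (a - r * m / N) / sqrt(m * n * r * s / (N ** 2 * (N - 1)))        # (22)
    return f1, f2, f3, u, N

f1, f2, f3, u, N = classical_forms(15, 5, 3, 7)
print(round(f1, 4), round(f2, 4), round(f3, 4), round(u, 4))
```

The three forms agree, and $u$ equals the classical ratio multiplied by $\sqrt{(N-1)/N}$.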




(viii) Problem III

45. This may be described as the test for the independence of two characters $A$ and $B$.
It is supposed that the probability that an individual selected at random will possess
character $A$ is $p(A)$ and that he will not possess it is $p(\bar{A}) = 1 - p(A)$. The corresponding
probabilities for character $B$ are $p(B)$ and $p(\bar{B}) = 1 - p(B)$. Four alternative combinations
of the characters may occur, which may be denoted by $AB$, $\bar{A}B$, $A\bar{B}$ and $\bar{A}\bar{B}$. The various
probabilities are set out in Table 6A. If the null hypothesis, $H_0$, specifying the independence
of $A$ and $B$ is true, then

$$p(AB) = p(A) \times p(B), \quad p(\bar{A}B) = p(\bar{A})\,p(B), \quad \text{etc.} \tag{29}$$

To test the hypothesis, we have a random sample of $N$ observations with frequencies of
occurrence of the combinations $AB$, $\bar{A}B$, etc., which may be classified in the 2 × 2 scheme of
Table 6B. The sampling conditions are such that the probabilities of Table 6A are the same
for all individuals selected, or, in conventional terms, the sample is drawn from an infinite
population. Barnard calls this problem that of the double dichotomy.

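Table 6A. Probabilities

              $A$                   $\bar{A}$                   Total
$B$           $p(AB)$               $p(\bar{A}B)$               $p(B)$
$\bar{B}$     $p(A\bar{B})$         $p(\bar{A}\bar{B})$         $p(\bar{B})$
Total         $p(A)$                $p(\bar{A})$                1

Table 6B. Sample data

              $A$       $\bar{A}$     Total
$B$           $a$       $c$           $m$
$\bar{B}$     $b$       $d$           $n$
Total         $r$       $s$           $N$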


46. In Problem III there is only one application of a random process, the selection of $N$
individuals, each one of which must fall into one or other of four alternative categories. If
the random process were repeated and another sample of $N$ drawn, not only are the
frequencies $a$, $b$, $c$ and $d$ free to vary, but also both marginal totals, i.e. $m$ may change as well
as $r$. The experimental probability set will therefore contain results $(a, b, c, d)$ restricted by
the conditions (i) that none of the frequencies can be negative and (ii) that

$$a + b + c + d = N. \tag{30}$$

Geometrically, as Barnard points out, the set can be represented in 3 dimensions by points
at unit intervals within a tetrahedron obtained by placing on top of one another the series
of 2-dimensioned lattices of dimensions

$$0 \times N,\ 1 \times (N-1),\ 2 \times (N-2),\ \ldots,\ (N-1) \times 1,\ N \times 0. \tag{31}$$

47. We are again testing a composite hypothesis and should like to determine a family of
critical surfaces to be used as significance levels, dividing the points within the tetrahedron
in such a way that the chance of the sample point $(a, b, c, d)$* lying outside a given surface
$L_\epsilon$ is equal to $\epsilon$, whatever the values of the unknown probabilities $p(A)$ and $p(B)$. But
again, as in Problem II, owing to the discontinuity in the set of points, there are no 'similar
regions'. We note that if $H_0$ is true, the probability of the observed result is a term of the
multinomial expansion, viz.

$$\frac{N!}{a!\,b!\,c!\,d!}\; p(A)^{a+b}\,(1 - p(A))^{c+d}\; p(B)^{a+c}\,(1 - p(B))^{b+d}$$

$$= \frac{N!}{m!\,n!}\, p(B)^m (1 - p(B))^n \times \frac{N!}{r!\,s!}\, p(A)^r (1 - p(A))^s \times \frac{m!\,n!\,r!\,s!}{a!\,b!\,c!\,d!\,N!}$$

$$= P_2\{m \mid p(B), N\} \times P_2\{r \mid p(A), N\} \times P_1\{a \mid N, r, m\}. \tag{32}$$

Here, the notation of para. 30 has been repeated.

* In view of the condition (30), the point can be defined by three co-ordinates, e.g. as $(a, b, c)$,
$(a, b, m)$ or $(a, r, m)$. In view of the form of equation (32), the last system of co-ordinates will be used.
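The factorization in equation (32) can be checked for any particular table. The following sketch uses hypothetical function names and arbitrary illustrative values $p(A) = 0.6$, $p(B) = 0.55$, which are assumptions and not taken from the paper.

```python
from math import comb, factorial, isclose

def multinomial_prob(a, b, c, d, pA, pB):
    """Probability of the table (a, b, c, d) when A and B are independent."""
    N = a + b + c + d
    coef = factorial(N) // (factorial(a) * factorial(b) * factorial(c) * factorial(d))
    return coef * pA ** (a + b) * (1 - pA) ** (c + d) * pB ** (a + c) * (1 - pB) ** (b + d)

def product_form(a, b, c, d, pA, pB):
    """Product of two binomial terms and one hypergeometric term, as in equation (32)."""
    m, n, r, s = a + c, b + d, a + b, c + d
    N = m + n
    binom_m = comb(N, m) * pB ** m * (1 - pB) ** n     # chance of m individuals with B
    binom_r = comb(N, r) * pA ** r * (1 - pA) ** s     # chance of r individuals with A
    hyper = comb(m, a) * comb(n, b) / comb(N, r)       # partition given m and r
    return binom_m * binom_r * hyper

lhs = multinomial_prob(15, 5, 3, 7, 0.6, 0.55)
rhs = product_form(15, 5, 3, 7, 0.6, 0.55)
print(lhs, rhs)
```

The two values agree to rounding error, whatever probabilities are supplied.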

48. Thus the probability of obtaining a sample represented by the triplet $(a, r, m)$ may be
regarded, if the characters $A$ and $B$ are independent, as the product of three terms:

(i) The probability of drawing $m$ individuals with character $B$ in a random sample of $N$,
i.e. the probability that $(a, r, m)$ falls in a horizontal section of the tetrahedron on which
$m$ = constant. This is the $(m + 1)$th term in the expansion of the binomial

$$\{(1 - p(B)) + p(B)\}^N.$$

(ii) The probability of drawing $r$ individuals with character $A$ in a random sample of $N$,
i.e. the probability that $(a, r, m)$ falls on the vertical section of the tetrahedron on which
$r$ = constant. This is the $(r + 1)$th term in the expansion of

$$\{(1 - p(A)) + p(A)\}^N.$$

(iii) The probability, given $m$ and $r$, of the observed partition within the 2 × 2 table. This
term represents the relative probability associated with the points lying along a straight line
$m$ = constant, $r$ = constant; it is, of course, the same expression as has arisen in Problems
I and II and is proportional to a term in the hypergeometric series $F(-r, -m, n - r + 1, 1)$.

49. We are faced with a situation similar to that met under Problem II. Were it possible
to cut off from each line on which $m$ = constant, $r$ = constant, a group of points such that

$$\sum_a \big[ P_1\{a \mid N, r, m\} \big] = \epsilon, \tag{33}$$

then the subset of points within the tetrahedron composed of the sum of these groups for
all possible combinations of $m$ and $r$ would have the property required of a 'critical region'
in a significance test: i.e. the chance that the point $(a, r, m)$ is included in the region, if $H_0$
is true, would be $\epsilon$ whatever values the irrelevant probabilities $p(A)$ and $p(B)$ assumed.
However, (33) cannot be satisfied in general, and all that is possible is to define a family of
significance contours such that the chance of a sample point falling beyond any one of them,
say $L_\epsilon$, is < $\epsilon$. By using the normal approximation to the sum of the hypergeometric
tail-terms with the correction for continuity as described in paras. 35-39 for Problem II, we shall
be very much on the safe side, i.e. the formal level of $\epsilon$ is likely to be much above the true
chance of falling beyond the level, whatever be $p(A)$ or $p(B)$. The presence of the two binomial
terms in equation (32) instead of the single term in equation (8.3), makes it likely that the
overestimation of $\epsilon$ will be greater in Problem III than in II. It is to be expected, therefore,
that at any rate when none of $m$, $n$, $r$ or $s$ is too small, the better approximation will be
obtained by referring the $u$ of equation (22) to the normal probability scale.

50. The handling of Problem III is discussed briefly by Barnard on p. 136 above.
There is clearly room for further investigation. The general nature of the approximation
involved is of course that which arises in every test for goodness of fit or for independence
in an $h \times k$ table, where we replace a distribution consisting of a finite set of probabilities at
discrete points in multiple space by a continuous distribution for which integration outside
ellipsoidal contours is straightforward.

(ix) General comment

51. The duties of the statistician lie at many levels. He may be required merely to apply
an established technique of analysis to an assembly of numerical data and this application 
may result in a statement, based on probability theory, of a ‘level of significance’ or a 
‘confidence interval’, which will be used by others. Or he may be called on to share in 
planning the investigation or experiment which is to provide the data and then to draw 
conclusions from their analysis which will lead to further action. In this final role he needs 
to bring into play faculties which are no monopoly of his calling, the qualities of sound 
judgement which are the characteristics of a well trained, scientific mind. In the weighing 
of evidence, the result of the statistical analysis, expressed in one or more conventional 
probability figures, is only one factor in the summing up; as important, may be, is the 
question of whether the mathematical model is a fair counterpart to the happenings in the 
observational field. In addition, there will often be much information coming from outside
the range of the immediate investigation, yet hardly expressible in numerical terms, which 
must influence decision. 

52. It is perhaps hard experience gained in certain fields of war-time research, where 
decisions had to be reached on statistical data far less ample than could be wished, which has 
forced my own attention to this question: What weight do we actually give to the precise 
value of a probability measure when reaching decisions of first importance ? One subject for 
examination falling under this inquiry is clearly the logical basis of the reasoning process 
by which judgement is influenced as a result of the application of a test of significance. This 
was the theme on which this paper opened. The approach illustrated in the pages which 
followed is a personal one and is set down, with no claim to be the best, in order to provoke 
thought and discussion. There appears no short route to a right answer in this matter; each 
individual who hopes to use his own judgement to the full in drawing conclusions from the 
statistical analysis of sampling data, must decide for himself what he requires of probability 
theory. 

53. In the approach which I have followed and illustrated on the analysis of data classed
in a 2 × 2 table, the appropriate probability set-up is defined by the nature of the random
process actually used in the collection of the data. Consideration of this point forms the
initial step in the determination of the appropriate test. On this score, what I have termed
Problems I, II and III are differentiated. The difference is fundamental and lies at the
bottom of the dilemma to which the Barnard-Fisher correspondence in Nature drew
attention. It can be illustrated on the following data, given in Table 7, where I shall suppose that
the effect we are interested in is that making $a$ significantly greater than $b$.

54. If (a) the results have been obtained by random assignment of Treatment 1 to
eighteen out of thirty individuals and Treatment 2 to the remaining twelve, and

(b) we merely ask whether the results are consistent with the hypothesis that the
treatments are equivalent as far as these thirty individuals are concerned, so that the difference
between the proportions 15/18 and 5/12 may reasonably be ascribed to a chance fluctuation,

(c) we are then concerned with Problem I, i.e. simply with the probabilities associated
with the points $(a, 20 - a)$ on the diagonal $r = 20$ of Fig. 4. The chance of getting $a \ge 15$, if
the null hypothesis is true, is 0.0241,* or, using a common phrase, we can speak of the result
being significant at the 2.5% level.
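The exact tail chance for Problem I, and the normal approximation with the continuity correction given in the author's footnote, can be reproduced as follows. This is a minimal sketch; the function names are hypothetical and the normal tail area is obtained from the complementary error function.

```python
from math import comb, sqrt, erfc

def upper_tail_exact(a_obs, N, r, m):
    """Exact chance that a >= a_obs on the diagonal r = constant (fixed margins)."""
    n = N - m
    terms = sum(comb(m, a) * comb(n, r - a) for a in range(a_obs, min(m, r) + 1))
    return terms / comb(N, r)

def upper_tail_normal(a_obs, N, r, m):
    """Normal approximation using the continuity correction of 1/2."""
    n, s = N - m, N - r
    sd = sqrt(m * n * r * s / (N ** 2 * (N - 1)))
    u = (a_obs - 0.5 - r * m / N) / sd
    return u, 0.5 * erfc(u / sqrt(2))      # area of the normal curve beyond u

exact = upper_tail_exact(15, 30, 20, 18)
u, approx = upper_tail_normal(15, 30, 20, 18)
print(round(exact, 4), round(u, 3), round(approx, 3))
```

The approximation slightly overstates the exact tail chance here, in line with the appendix's remark that the corrected normal integral tends to be a little too large.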

55. On the other hand, if a sample of 18 has been drawn randomly from one population
and a sample of 12 independently from a second and we wish to test whether $p_1(A) = p_2(A)$,
then it seems to be an artificial procedure to restrict the experimental probability set to
the 11 points on the line $r = 20$, i.e. to the values of $a$: 8, 9, ..., 18. A repetition of the double
sampling process could give us a result $(a, b)$ falling at any of the 19 × 13 = 247 points in
the lattice diagram of Fig. 4. There will be a number of ways of defining a family of
significance levels for this 2-dimensioned set; if we adopt that discussed in paras. 40-41, which
gives as two of its members the dotted, stepped lines shown in Fig. 4, we can say that the
chance of a result falling beyond the lower line is certainly less than 0.015.† The observed
point, with $a = 15$, $b = 5$, falls beyond the line, so that the result is undoubtedly 'significant
at the 1.5% level'.

Table 7

For Problem I     For Problem II                 Frequency of results       Total
                                                 $A$          $\bar{A}$
1st treatment     Sample from 1st population     a = 15       c = 3         m = 18
2nd treatment     Sample from 2nd population     b = 5        d = 7         n = 12
Total                                                                       N = 30

56. These two probabilities, 2.5 and 1.5%, are not the same, but there is no inconsistency
in their difference. The character of the two investigations is different and to treat Problem II
as though it were Problem I seems to call for a probability set-up which is unnecessarily
artificial, when a simpler one is available. Admittedly by getting what seems to me a closer
relation between the probability set-up and the experimental procedure, we have sacrificed
some simplicity in handling the 2 × 2 table. But this is only the case when dealing with
small numbers. For large numbers the methods of handling Problems I, II and III become,
practically, identical.

57. Consider again the heavy shell problem described in para. 7 above. If we are to
introduce probability theory, it seems to me that we should regard the problem as one in which
we have a sample of $m = 12$ from the possible output of shell made to one design or by one
firm and of $n = 8$ from the possible output of a second. This sampling may be hypothetical
in that these may be 'pilot' shell, the first off production; nevertheless, this construct is
clearly less artificial than one in which, on the null hypothesis, we regard the experiment as
though it were made on twenty shells, to twelve of which has been randomly assigned the
label 'Made by firm X' and to the other eight, 'Made by firm Y'.

* For the normal curve approximation, using the correction for continuity, we find

$$u = (15 - \tfrac{1}{2} - 12.0)/1.2866 = 1.943.$$

The proportionate area under the normal curve beyond this deviation is 0.026.

† Table 5, col. (5) shows the largest value of this chance to be 0.0120 for $p = 0.3$. This figure cannot
be much exceeded for other $p$'s though I have not determined the precise maximum. I give 0.015 as
a safe-side limit.

58. It is clear that in the heavy shell problem there may be many reasons to doubt
whether the rounds fired can be regarded as a random sample from future output. That is
why I have emphasized that the exploration which the statistician makes in private will not
necessarily be presented in figures at the conference table. In this example, the proportions
of successful perforations were 2/12 and 5/8; these put us on the line, $r = 7$, of the lattice
diagram for which the hypergeometric probabilities were shown in Fig. 2. The sum of the
terms with $a \le 2$ is 5.2% (normal approximation, using the $\tfrac{1}{2}$-correction, 5.6%). This is
the chance of getting as great or a greater positive difference, $b - a$, if $H_0$ were true, treating
the case as Problem I. Barnard's method has not yet been extended to cover this case,
but if we were to use the large sample method for handling Problem II, described in my
paras. 40-41, we should find from equation (22) that

$$u = (2 - 4.2)/1.072 = -2.05,$$

which puts $(a, b)$ outside the upper 2.5% level.

59. Were the action taken to be decided automatically by the side of the 5% level on
which the observation point fell, it is clear that the method of analysis used would here be 
of vital importance. But no responsible statistician, faced with an investigation of this 
character, would follow an automatic probability rule. The result of either approach would 
raise considerable doubts as to whether the performance of the first type of shell was as good 
as that of the second, but without the whole background of the investigation it is impossible 
to say what the statistician’s recommendation as to further action would be. 

60. In the example of the proof of anti-tank shot discussed in para. 6, the chance of 
perforation, p, while varying from plate to plate and batch to batch, will almost certainly 
not range through the whole interval 0-1. The striking velocity of the shot would also
probably be adjusted so that for average proof-plate and batches, p was near Then the 
discriminating level (or levels*) set across the 13 x 13 lattice diagram would be fixed paying 
regard to the likely variation in p; thus a fairly close upper limit could be calculated to the 
true probability of $(a, b)$ falling beyond the level if the fresh batch were of the same quality
as the standard. This is the upper limit of the risk of segregating the batch wrongly. 

61. Precisely similar problems arise for consideration in even more difficult form in the
analysis of data arranged in an $h \times k$ table, where $h$ or $k$ or both are > 2. It has become common
practice to speak of the solution of this problem in terms of 'fixed marginal totals', but it
may be questioned whether the restriction in the experimental probability set implied is
generally appropriate. The frequencies in an $h \times k$ table may have been obtained by many
different sampling procedures for, as in the 2 × 2 problem, a single form of tabular presentation
will follow from a variety of types of investigation. For most of these, a repetition of the random
process of selection would give results with either one or both sets of marginal totals changed.

62. For convenience in solution we may, of course, start by considering the distribution
of our test criterion, on the null hypothesis, within the sub-set of results for which the margins
are fixed. If this distribution were the same whatever these fixed values, then the overall
distribution for unrestricted sampling would be the same as that for variation subject to
fixed margins. Thus, mathematically, the solution of the partial problem would be a step in
the solution of the complete one. But when applying $\chi^2$ analysis to an $h \times k$ table, this result
is only true as a large-sample approximation.

* It is possible that two levels might be taken with the associated proof rules: (i) if $(a, b)$ falls beyond
the outer one, reject the batch; (ii) if between outer and inner, fire further rounds; (iii) if within the
inner level, accept the batch.

63. If we use the mathematical model which it is suggested gives the most direct aid in
reasoning from the observations, i.e. that which regards the experimental probability set
as generated by a repetition of the random process of selection used in collecting the data,
then in the majority of cases we cannot regard the marginal totals as fixed. Thus a rigorous
treatment would lead, as in the case of the 2 × 2 table, to a differentiation into a number of
solutions. It is to be hoped, however,* unless the numbers in the margins are very small,
that the $\chi^2$ approximation with its appropriate degrees of freedom† will give results which
are not misleading. This approximation leads, of course, in the 2 × 2 table to the reference
of the ratio $u$ of equation (22) to the normal probability scale. Some aspects of the
approximation in this more general case were discussed by Yates (1934, pp. 233-35).

64. In closing I should like again to acknowledge my indebtedness to Mr G. A. Barnard. 
Having had the good fortune to discuss these problems with him and see drafts of his work 
over a period of 2 or 3 years it is difficult to say how many of his ideas have been built un- 
consciously into my own earlier approach. But I am especially aware of the clarification 
which his emphasis on the distinction between Problems I, II and III brought to my survey. 
I am also very grateful to Mr M. G. Kendall, Dr R. C. Geary and Dr B. L. Welch for a number
of helpful criticisms, and to Mrs Maxine Merrington for her extensive computing work, which
has alone made possible the various numerical illustrations that I have given. 

* From the point of view both of the exponents of the fixed marginal and unrestricted marginal 
approach. 

† The statement that, for example, in applying the test of independence of two characters to an
$h \times k$ table, the degrees of freedom are $(h - 1) \times (k - 1)$, does not of course mean that sampling is
restricted by fixed marginal totals. All that is implied is that approximately the overall distribution of
the function of the observations used is the same as that for sampling within the restricted sub-set;
this is because the distribution within each sub-set is approximately independent of the particular
marginal totals which define it.


REFERENCES 

Barnard, G. A. (1945a). Nature, Lond., 156, 177.
Barnard, G. A. (1945b). Nature, Lond., 156, 783.
Fisher, R. A. (1935). J. Roy. Statist. Soc. 98, 39.
Fisher, R. A. (1941). Science, 94, 210.
Fisher, R. A. (1945a). Nature, Lond., 156, 388.
Fisher, R. A. (1945b). Sankhyā, 7, 130.
Kendall, M. G. (1943). The Advanced Theory of Statistics, 1. London: Charles Griffin and Co. Ltd.
Neyman, J. & Pearson, E. S. (1928a). Biometrika, 20 A, 196.
Neyman, J. & Pearson, E. S. (1928b). Biometrika, 20 A, 263.
Neyman, J. & Pearson, E. S. (1933). Philos. Trans. A, 231, 289.
Neyman, J. & Pearson, E. S. (1936). Statist. Res. Mem. 1, 113.
Neyman, J. & Pearson, E. S. (1938). Statist. Res. Mem. 2, 26.
Pearson, K. (1899). Phil. Mag. 47, 236.
Przyborowski, J. & Wilenski, H. (1939). Biometrika, 13, 313.
Wilson, E. B. (1941). Science, 93, 667.
Wilson, E. B. (1942). Proc. Nat. Acad. Sci., Wash., 28, 94.
Yates, F. (1934). J. Roy. Statist. Soc. Suppl. 1, 217.







APPENDIX 


THE NORMAL CURVE APPROXIMATION IN PROBLEM I


1. The following Tables 8 and 9 (A), (B) and (C) show the order of accuracy which results
from using the normal curve integral as an approximation to the tail sums in the series

$$P_1\{a \mid N, r, m\} = \frac{m!\,n!\,r!\,s!}{a!\,b!\,c!\,d!\,N!}, \tag{34}$$

the terms of which are proportional to those in the hypergeometric series

$$F(-r, -m, n - r + 1, 1).$$

Here $a$ is a variable which can assume the range of positive, integral values indicated
under (i), (ii) and (iii) in para. 20 above, while $N$, $r$ and $m$ are fixed. The relation
between these quantities and $b$, $c$, $d$, $n$ and $s$ is given in Table 1, para. 17. The method of
approximation, using the correction for continuity, has been discussed in para. 26.

2. Table 8 takes the case of an equal partition, $m = n = \tfrac{1}{2}N$, and shows the sum of the
terms in the expression (34) for which $a \le a_1$, which is also the sum of terms for which
$a \ge r - a_1$. For unequal partitions, results are given in Table 9 for $m > n$ and for the following
proportionate partitions of $N$:

(A) m=fJV, Ji=fV; (B) m=^N, n=^N; (C) m=^N, n=-M- 
Here sums of terms at both tails of the series are needed. The sums (or chances of $a \le a_1$
or $a \ge a_2$) have not been given for all possible values of $a_1$ and $a_2$ but, broadly speaking, for those
within the limits where significance is likely to be in question. Sums below 0.0010 have
generally been omitted. In each case the true sum of the terms (34) is compared with the
approximation from the normal integral.


3. In drawing conclusions from the comparison, we have to decide what degree of 
accuracy is called for. Clearly the normal integral does not give mathematically exact 
results to 4 decimal places. On the other hand, except for certain instances where the 
partition is very unequal (m = |N and ^N) and r is small, the order of the approximation 
may be said to follow that of the series closely. If decisions are made by rule of thumb, 
according to the side of the 5 % or 1 % significance level on which a falls, then there are 
a number of entries in the tables where the approximation would give a on the wrong side. 
But one may question whether judgement of significance based on a single experiment can 
in fact be made sensitive to a difference between, say, 0.06 and 0.04 (odds of 16 to 1 and
24 to 1) or between 0.012 and 0.008 (odds of 82 to 1 and 124 to 1) and, given such latitude
in accuracy, the approximation will be found generally sufficient. These must be points, 
however, where personal opinions will differ. Whatever views are held, the tables are 
sufficiently extensive to make it possible to obtain from them a rough measure of the 
accuracy of approximation in a wide range of cases. 

4. It will he noted that in the symmetrical case (m = ^N) and also when m the 
normal approximation for the tail sum is almost invariably a little too large. Undoubtedly for the symmetrical case an improved approximation could be obtained by modifying the ½ correction used in calculating the ratio of deviation to standard deviation. This second order term would, however, need to vary with the probability level, thus complicating the procedure.



Table 8. Case of equal partition, $m = n = \tfrac12 N$. Chance that $a \le a_0$ = chance that $a \ge r - a_0$































Table 9. Case of unequal partition. Chances that $a \le a_0$ and $a \ge a_0$
(A) m — f AT, n = fiV 




































Table 9 (continued)






































2x2 TABLES. A NOTE ON E. S. PEARSON’S PAPER 

By G. A. BARNARD


As Prof. Pearson has kindly shown me the proof of his paper, I should like to make the following further remarks.

1. If we have a sample of $N$ from a population in which there is a chance $p$ that an individual will have a character $A$, we can represent it in the form

$$x_1, x_2, x_3, \ldots, x_i, \ldots, x_N,$$

where $x_i$ is 1 or 0 according as the $i$th member has $A$ or not.* Regarding the $x$'s as quantitative variables, we have by classical results the unbiased estimates

$$\hat p = x_{\cdot} = (\Sigma x_i)/N \quad\text{and}\quad \hat\sigma^2 = \Sigma(x_i - x_{\cdot})^2/(N-1).$$

If $r$ of the $x$'s are 1, while $s$ are 0, we find

$$\hat p = r/N \quad\text{and}\quad \hat\sigma^2 = rs/\{N(N-1)\}.$$

Using this unbiased estimate of variance in Prof. Pearson's para. 43, we get, instead of his corresponding expression,

$$\frac{d}{s_d} = \frac{a - rm/N}{\sqrt{\dfrac{mnrs}{N^2(N-1)}}}, \tag{1}$$

agreeing exactly with his (22).

2. To carry the argument further, in classical theory, if we have two samples

$$x_1, x_2, \ldots, x_m \quad\text{and}\quad y_1, y_2, \ldots, y_n,$$

to test whether the samples come from the same normal population we take

$$t = \frac{x_{\cdot} - y_{\cdot}}{s}\sqrt{\frac{mn}{m+n}},$$

where $x_{\cdot} = (\Sigma x_i)/m$, $y_{\cdot} = (\Sigma y_j)/n$ and

$$s^2 = \frac{\Sigma(x_i - x_{\cdot})^2 + \Sigma(y_j - y_{\cdot})^2}{m+n-2}, \tag{2}$$

and use tables of the $t$ distribution for $(m+n-2)$ degrees of freedom.

It is common practice to neglect departures from normality in applying this test. If we do so, and apply it to our qualitative case along the lines indicated above, we get

$$t = \frac{a - rm/N}{\sqrt{\dfrac{acn + bdm}{N(N-2)}}},$$

which, if we are justified in our neglect of departures from normality, should be distributed as $t$ on $(N-2)$ degrees of freedom.
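The algebra of paras. 1 and 2 can be checked on a single table. The sketch below is a hedged illustration: it assumes, as the formulae suggest, that $a$ and $c$ are the numbers with and without the character $A$ in the first sample (size $m = a + c$), and $b$ and $d$ those in the second (size $n = b + d$), so that $r = a + b$ and $s = c + d$. The function name and the example table are hypothetical.

```python
import math

def two_by_two_statistics(a, b, c, d):
    """Return (d/s_d computed directly, formula (1), t computed directly, t formula)."""
    m, n = a + c, b + d            # sizes of the two samples
    r, s = a + b, c + d            # numbers with and without the character A
    N = m + n
    x_bar, y_bar = a / m, b / n    # sample proportions, treated as means of 0/1 values

    # Para. 1: unbiased variance estimate rs / {N(N-1)} from the pooled sample.
    var_hat = r * s / (N * (N - 1))
    ratio_direct = (x_bar - y_bar) / math.sqrt(var_hat * (1 / m + 1 / n))
    ratio_formula = (a - r * m / N) / math.sqrt(m * n * r * s / (N * N * (N - 1)))

    # Para. 2: classical two-sample t with the within-sample sum of squares (2).
    ss = (a * (1 - x_bar) ** 2 + c * x_bar ** 2
          + b * (1 - y_bar) ** 2 + d * y_bar ** 2)
    s2 = ss / (N - 2)
    t_direct = (x_bar - y_bar) / math.sqrt(s2) * math.sqrt(m * n / N)
    t_formula = (a - r * m / N) / math.sqrt((a * c * n + b * d * m) / (N * (N - 2)))
    return ratio_direct, ratio_formula, t_direct, t_formula

ratio_direct, ratio_formula, t_direct, t_formula = two_by_two_statistics(7, 2, 3, 8)
print(ratio_direct, ratio_formula)
print(t_direct, t_formula)
```

Each pair of values agrees to rounding error, which is all the sketch is meant to show; it says nothing about which of the two statistics is to be preferred.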


* For a similar argument see B. L. Welch (1938, p. 165).



3. To obtain the formula (1) on these lines, we have in effect to commit the well-known fallacy of replacing $s^2$, as given by (2), by

$$s'^2 = \frac{\Sigma(x_i - m')^2 + \Sigma(y_j - m')^2}{m+n-1}, \tag{3}$$

where $m' = (\Sigma x_i + \Sigma y_j)/(m+n)$.

We are led to ask why (3) should be approximately correct (and in fact it is better than (2)) in the qualitative case, while (2) is preferred in the quantitative case.

4. The simplest reason for preferring (2) to (3) in the quantitative case is that $s'^2$ is not independent of $(x_{\cdot} - y_{\cdot})$, so that the conditions for validity of the $t$ distribution are not satisfied. In our qualitative case this argument loses validity, since neither $s^2$ nor $s'^2$ is independent of $(x_{\cdot} - y_{\cdot})$.

The second reason for preferring (2) to (3) in the quantitative case is more complicated, but for our purposes it reduces essentially to the fact that, in the case of normal distributions, and only in this case, the mean and variance of samples are independently distributed, so that the common mean value of the populations, estimated by $m'$, is irrelevant to the test for differences. In our qualitative case, on the other hand, $m'$ contributes to our knowledge of the variance.

5. If we apply Pitman's 'absolute' analogue of the $t$ test to our case, we arrive at the hypergeometric series of Prof. Pearson's Problem I. But Bartlett's argument, showing the convergence of Pitman's test and the $t$ test, will apply here only in very large samples, because of the finite probability of obtaining observed values which coincide.

6. From the above point of view, Prof. Pearson's analysis of his Problem II may be regarded in one sense as an examination of the effect of large departures from normality on the $t$ test. In this light, his conclusions given in paras. 51 and 52 are seen to extend to the $t$ test, as well as to the 2 x 2 table problem.

7. If I may state my personal attitude, it is that statistics is a branch of applied mathematics, like symbolic logic or hydrodynamics. Examination of foundations is desirable, but it must be remembered that undue emphasis on niceties is a disease to which persons with mathematical training are specially prone. In pure mathematics itself there are disputes on foundations which closely parallel the disputes over the foundations of statistics. The lesson to be drawn is that, while statistics is a most valuable aid to judgement, it cannot wholly replace it.

8. Finally, it must be emphasized that the order of printing of Prof. Pearson's paper and my own reflects Prof. Pearson's generosity rather than the historical order of events. Much of his paper was, unknown to me, given in lectures before the war; whereas my work on the problem began only in 1943. Since then I have owed much both to Prof. Pearson's published work and to discussions which I have been privileged to have with him.


REFERENCE

Welch, B. L. (1938). Biometrika, 30, 15S. 





THE CUMULANTS OF THE $z$ AND OF THE LOGARITHMIC $\chi^2$ AND $t$ DISTRIBUTIONS

By JOHN WISHART
School of Agriculture, Cambridge

Explicit expressions for the exact cumulants of Fisher's $z$-distribution do not appear ever to have been published. They were therefore worked out, and appear in § 2 of this paper. It afterwards appeared that the logical method of presentation was to deal with the similar problem for $\tfrac12\log(\chi^2/n)$,* since the $z$-distribution involves the simple difference of two such functions which are independent. This led to § 1. Since writing this paper, Bartlett & Kendall (1946) have published the same result in the form of the cumulants of $\log s^2$, and have given graphical and tabular representations for varying $n$ up to 20. The solution is, of course, implicit in Cornish & Fisher's (1937) statement of the moment generating function, while Mr C. R. Rao has informed me that he reached the same result in work done for an M.A. Thesis of the University of Calcutta (unpublished). § 1 has accordingly been shortened, but is retained in view of the additional formulae to those of Bartlett and Kendall.


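1. The logarithmic $\chi^2$ distribution

The distribution of $\chi^2$, for $n$ degrees of freedom, is given by

$$dF = \frac{1}{2^{\frac12 n}\,\Gamma(\tfrac12 n)}\,(\chi^2)^{\frac12 n - 1} e^{-\frac12\chi^2}\,d(\chi^2).$$

As pointed out by Cornish & Fisher (1937), the mean value of $\exp\{\tfrac12 t\log(\chi^2/n)\}$, i.e. of $\exp\{\tfrac12 t\log(\tfrac12\chi^2) - \tfrac12 t\log(\tfrac12 n)\}$, is the moment generating function of the distribution of $\tfrac12\log(\chi^2/n)$, namely

$$M = (\tfrac12 n)^{-\frac12 t}\,\Gamma\{\tfrac12(n+t)\}/\Gamma(\tfrac12 n).$$

The cumulant generating function is

$$K = \log M = -\tfrac12 t\log(\tfrac12 n) + \log\Gamma\{\tfrac12(n+t)\} - \log\Gamma(\tfrac12 n).$$

The cumulants of the distribution of $\tfrac12\log(\chi^2/n)$ are readily written down by differentiating $K$ successively with respect to $t$ and at each stage putting $t = 0$. We have in fact

$$\kappa_1 = -\tfrac12\log(\tfrac12 n) + \tfrac12\,\frac{d}{d(\tfrac12 n)}\log\Gamma(\tfrac12 n)$$

and

$$\kappa_s = \frac{(-1)^s (s-1)!}{2^s}\,\zeta(s, a) \quad (s > 1,\ a = \tfrac12 n), \tag{1}$$

where $\zeta(s, a)$ denotes the generalized Zeta-function

$$\zeta(s, a) = \sum_{j=0}^{\infty}\frac{1}{(a+j)^s}.$$

* All logarithms in this paper are to base $e$.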



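The cumulants may be readily computed by throwing them into the form

$$2\kappa_1 = \psi(\tfrac12 n) - \log(\tfrac12 n), \qquad 2^s\kappa_s = \psi^{(s-1)}(\tfrac12 n) \quad (s > 1), \tag{2}$$

where $\psi(x) = d\{\log\Gamma(x)\}/dx$, $\psi^{(s-1)}(x) = d^s\{\log\Gamma(x)\}/dx^s$.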

$\psi(x)$ is variously called the Psi or Digamma function, and its derivatives have been called the Trigamma, Tetragamma, etc. Functions, and the series the Polygamma Functions. These functions have been computed in some considerable detail. For $n$ up to 22 the mean and variance can be got from Elinor Pairman's 'Tables of the Digamma and Trigamma functions' (1919). Tables up to Pentagamma appear in Vol. I of the British Association's Mathematical Tables (1931), but with certain gaps which, although intended to be bridged by reduction formulae, render the tables less generally useful (for $n$ less than 22) than H. T. Davis's Tables (1933, 1935). Table 10 of Vol. I gives all that is required for $\psi(x)$; in Vol. II, Tables 14-16, 18-20, 22-24 and 26-28 cover a wide range up to Hexagamma.
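Where the tables are not to hand, the same quantities can be approximated directly. The sketch below is an illustration under stated assumptions rather than the method of the paper: it evaluates $\psi(x)$ by upward recurrence followed by the usual asymptotic series, and the generalized Zeta-function by direct summation with an Euler-Maclaurin tail, and then forms the cumulants of $\tfrac12\log(\chi^2/n)$ as $\kappa_1 = \tfrac12\{\psi(\tfrac12 n) - \log(\tfrac12 n)\}$ and $\kappa_s = (-1)^s (s-1)!\,2^{-s}\zeta(s, \tfrac12 n)$. All function names and the truncation points are hypothetical choices.

```python
import math

EULER_GAMMA = 0.5772156649015329

def digamma(x):
    """psi(x) = d log Gamma(x) / dx for x > 0 (recurrence, then asymptotic series)."""
    acc = 0.0
    while x < 10.0:               # psi(x) = psi(x + 1) - 1/x
        acc -= 1.0 / x
        x += 1.0
    u = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - u * (1.0 / 12 - u * (1.0 / 120 - u / 252))

def hurwitz_zeta(s, a, terms=100):
    """zeta(s, a) = sum over j >= 0 of (a + j)^(-s), for s > 1 and a > 0."""
    head = sum((a + j) ** (-s) for j in range(terms))
    x = a + terms                 # Euler-Maclaurin estimate of the remaining tail
    return head + x ** (1 - s) / (s - 1) + 0.5 * x ** (-s) + s * x ** (-s - 1) / 12.0

def log_chi2_cumulants(n, order=4):
    """Cumulants kappa_1 .. kappa_order of (1/2) log(chi^2 / n)."""
    a = 0.5 * n
    kappa = [0.5 * (digamma(a) - math.log(a))]
    for s in range(2, order + 1):
        kappa.append((-1) ** s * math.factorial(s - 1) * 2.0 ** (-s) * hurwitz_zeta(s, a))
    return kappa

k2 = log_chi2_cumulants(2)        # n = 2
k1 = log_chi2_cumulants(1)        # n = 1
print(k2[:2], k1[:2])
```

For $n = 2$ the first two cumulants should reduce to $-\tfrac12\gamma$ and $\tfrac14\zeta(2)$, and for $n = 1$ to $-\tfrac12(\gamma + \log 2)$ and $\tfrac34\zeta(2)$, which gives a simple check on the routine.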

As shown by Bartlett & Kendall (1946), the approach to normality is very slow. For $n = 24$ (the limit for $n_2$ of the $z$ table of Fisher & Yates (1943), which provides percentage points for the distribution under consideration in the line $n_1 = \infty$) the cumulants have been worked out to $\kappa_6$, the last being specially computed from its formula given below. The gamma ratios are $\gamma_1 = -0.296$, $\gamma_2 = 0.174$, $\gamma_3 = -0.164$ and $\gamma_4 = 0.176$, and $|\gamma|$ increases thereafter at this level of $n$ instead of tending to zero. Approximate percentage points may, however, be worked out by using the formulae at the foot of the $z$ table, putting $n_1 = \infty$.

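For small $n$, we note that

$$\zeta(s) = \sum_{j=1}^{\infty}\frac1{j^s} = \frac1{1^s} + \frac1{2^s} + \cdots + \frac1{(r-1)^s} + \zeta(s, r),$$

$$\zeta(s)(1 - 2^{-s}) = \frac1{1^s} + \frac1{3^s} + \frac1{5^s} + \cdots + \frac1{(2r-1)^s} + 2^{-s}\zeta(s, r + \tfrac12).$$

We thus get, for $n = 2r$,

$$\kappa_s = \frac{(-1)^s (s-1)!}{2^s}\left\{\zeta(s) - \left(1 + \frac1{2^s} + \frac1{3^s} + \cdots + \frac1{(r-1)^s}\right)\right\} \quad (s > 1), \tag{3}$$

in which the terms in $\{\ldots\}$ reduce to $\zeta(s)$ for $n = 2$.

For $n = 2r + 1$

$$\kappa_s = (-1)^s (s-1)!\left\{\zeta(s)(1 - 2^{-s}) - \left(1 + \frac1{3^s} + \frac1{5^s} + \cdots + \frac1{(2r-1)^s}\right)\right\} \quad (s > 1), \tag{4}$$

in which the terms in $\{\ldots\}$ reduce to $\zeta(s)(1 - 2^{-s})$ for $n = 1$.

In the special case of $s = 1$ we have

$$n = 2r: \quad \kappa_1 = -\tfrac12(\gamma + \log\tfrac12 n) + \tfrac12\left(1 + \frac12 + \frac13 + \cdots + \frac1{r-1}\right), \tag{5}$$

$$n = 2r + 1: \quad \kappa_1 = -\tfrac12(\gamma + \log 2n) + \left(1 + \frac13 + \frac15 + \cdots + \frac1{2r-1}\right). \tag{6}$$

For $n = 2$ and 1 respectively these expressions reduce to the first bracket. $\zeta(s)$ can be got from tables, and in particular

$$\zeta(2m) = 2^{2m-1}\pi^{2m}B_m/(2m)!$$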



where the $B$'s are the Bernoulli numbers $B_1 = \tfrac16$, $B_2 = \tfrac1{30}$, $B_3 = \tfrac1{42}$, $B_4 = \tfrac1{30}$, $B_5 = \tfrac5{66}$, etc. $\gamma$ is Euler's constant. For reference we may quote:

$\gamma$ = 0.57721 56649, $\zeta(4)$ = 1.08232 32337,

$\zeta(2)$ = 1.64493 40668, $\zeta(5)$ = 1.03692 77551,

$\zeta(3)$ = 1.20205 69032, $\zeta(6)$ = 1.01734 30620.

Note that $\operatorname{Lt}_{s\to1}\{\zeta(s, a) - 1/(s-1)\} = -\psi(a)$.

For large $n$, asymptotic formulae for the Zeta-function may be used, and we get

$$\kappa_1 \sim -\frac1{2n} - \frac1{6n^2} + \frac1{15n^4} - \frac8{63n^6} + \frac8{15n^8} - \cdots,$$

$$\kappa_s \sim (-1)^s\left\{\frac{(s-2)!}{2n^{s-1}} + \frac{(s-1)!}{2n^s} + 2\sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j\,(2j+s-2)!}{(2j)!\;n^{2j+s-1}}\right\} \quad (s > 1).$$

We may note in passing that not only may this general expression for $\kappa_s$ be applied to the special case $s = 1$ with the proviso that the first term in that case is dropped, but also that $\kappa_2$ may be obtained from $\kappa_1 + \tfrac12\log(\tfrac12 n)$ by term-by-term differentiation with respect to $n$, and likewise $\kappa_3$ from $\kappa_2$, $\kappa_4$ from $\kappa_3$, etc., by similar term-by-term differentiation. This follows from a property of the Zeta-function. It is therefore not necessary to write down the explicit expressions for $\kappa_2$, $\kappa_3$, etc., but we may note that their leading terms are $\tfrac12 n^{-1}$, $-\tfrac12 n^{-2}$, $n^{-3}$, $-3n^{-4}$, $12n^{-5}$, etc., so that the leading terms of $\gamma_1$ and $\gamma_2$ are $-\sqrt{2/n}$ and $4/n$ respectively, while $\gamma_r$ is $O(n^{-\frac12 r})$. More exactly we have, writing $n' = n - 1$,

^ ~ 1^ 

with corresponding expressions for Xg, x^, etc., obtained by differentiation with respect to 



Finally, if instead of the distribution of $\tfrac12\log(\chi^2/n)$ we are interested in the distribution of $\log(s^2)$, where $s^2$ is an estimate of $\sigma^2$ based on $n$ degrees of freedom, we have

$$\log(\chi^2/n) = \log s^2 - \log\sigma^2$$

and thus for the distribution of $\log(s^2)$ we have

$$\kappa_s = (-1)^s (s-1)!\,\zeta(s, a) \quad (s > 1,\ a = \tfrac12 n),$$

while the $\gamma$ ratios are the same as for $\tfrac12\log(\chi^2/n)$. Obviously $\log\chi$ and $\log s$ can be treated similarly. See Bartlett & Kendall (1946).



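2. The $z$ distribution

The distribution of $z = \tfrac12\log(s_1^2/s_2^2)$, where $s_1^2$ and $s_2^2$ are independent estimates of a variance $\sigma^2$, based respectively on $\nu_1$ and $\nu_2$ degrees of freedom, is obviously that of

$$\tfrac12\log(\chi_1^2/\nu_1) - \tfrac12\log(\chi_2^2/\nu_2)$$

and its cumulants may therefore be at once derived from those of the logarithmic $\chi^2$ distribution. The cumulant generating function is

$$K = \log M = \tfrac12 t\log(\nu_2/\nu_1) + \log\Gamma\{\tfrac12(\nu_1 + t)\} + \log\Gamma\{\tfrac12(\nu_2 - t)\} - \log\Gamma(\tfrac12\nu_1) - \log\Gamma(\tfrac12\nu_2).$$

Further, we have

$$\kappa_1 = \tfrac12\left\{\log(\nu_2/\nu_1) + \operatorname{Lt}_{s\to1}\bigl(\zeta(s, a_2) - \zeta(s, a_1)\bigr)\right\}, \tag{7}$$

$$\kappa_s = 2^{-s}(s-1)!\,\{\zeta(s, a_2) + (-1)^s\zeta(s, a_1)\} \quad (s > 1,\ a_1 = \tfrac12\nu_1,\ a_2 = \tfrac12\nu_2).$$

For computing purposes these may be thrown into the forms

$$2\kappa_1 = \log(\nu_2/\nu_1) + \psi(\tfrac12\nu_1) - \psi(\tfrac12\nu_2),$$

$$2^s\kappa_s = \psi^{(s-1)}(\tfrac12\nu_1) + (-1)^s\psi^{(s-1)}(\tfrac12\nu_2) \quad (s > 1). \tag{8}$$

To illustrate, let us take $\nu_1 = 24$, $\nu_2 = 60$. We then have from the Polygamma tables (except for $\kappa_6$, which was specially computed):

$\kappa_1$ = -0.0127 429, $\kappa_3$ = -0.0007 998, $\kappa_5$ = -0.0000 104,

$\kappa_2$ = 0.0301 992, $\kappa_4$ = 0.0000 867, $\kappa_6$ = 0.0000 019,

$\sigma = \sqrt{\kappa_2}$ = 0.1737 792,

$\gamma_1 = -0.152$, $\gamma_2 = 0.095$ (or $\beta_1 = 0.023$, $\beta_2 = 3.095$), indicating the degree and nature of the departure from normality. $\gamma_3$ and $\gamma_4$ are -0.066 and 0.067 respectively.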

If as a first approximation we assume that for $\nu_1$ and $\nu_2$ of the order of the numbers chosen in this example, or higher, $z$ is distributed normally with mean and variance given by the above formulae, we obtain approximate percentage points, e.g. for the 95 and 5 % points we can subtract and add $1.6449\sigma$ from and to the mean. The result in the present case is to give us -0.299 and 0.273, the correct values being -0.306 and 0.265. The approximation is adequate to almost two figure accuracy, and is evidently useful when we only require to know whether an observed $z$ is significant or not. A better approximation is provided by the formulae attached to the $z$ tables (see Fisher & Yates (1943)), which yield -0.3046 and 0.2663 as against the correct values of -0.3065 (see Thompson (1941)) and 0.2664.
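The worked example with 24 and 60 degrees of freedom can be retraced numerically. The sketch below is a hedged illustration, not the tabular method of the paper: it assumes the forms $\kappa_1 = \tfrac12\{\log(\nu_2/\nu_1) + \psi(\tfrac12\nu_1) - \psi(\tfrac12\nu_2)\}$ and $\kappa_s = 2^{-s}(s-1)!\{\zeta(s, \tfrac12\nu_2) + (-1)^s\zeta(s, \tfrac12\nu_1)\}$, with a simple series evaluation of $\psi$ and of the Zeta-function standing in for the Polygamma tables. The function names are hypothetical.

```python
import math

def digamma(x):
    """psi(x) for x > 0, by upward recurrence and an asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x
        x += 1.0
    u = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - u * (1.0 / 12 - u * (1.0 / 120 - u / 252))

def hurwitz_zeta(s, a, terms=100):
    """zeta(s, a) for s > 1: direct sum plus an Euler-Maclaurin tail."""
    head = sum((a + j) ** (-s) for j in range(terms))
    x = a + terms
    return head + x ** (1 - s) / (s - 1) + 0.5 * x ** (-s) + s * x ** (-s - 1) / 12.0

def z_cumulants(nu1, nu2, order=4):
    """Cumulants kappa_1 .. kappa_order of z for nu1 and nu2 degrees of freedom."""
    a1, a2 = 0.5 * nu1, 0.5 * nu2
    kappa = [0.5 * (math.log(nu2 / nu1) + digamma(a1) - digamma(a2))]
    for s in range(2, order + 1):
        zsum = hurwitz_zeta(s, a2) + (-1) ** s * hurwitz_zeta(s, a1)
        kappa.append(2.0 ** (-s) * math.factorial(s - 1) * zsum)
    return kappa

kappa = z_cumulants(24, 60)
sigma = math.sqrt(kappa[1])
lower = kappa[0] - 1.6449 * sigma    # normal approximation to the lower 5 % point
upper = kappa[0] + 1.6449 * sigma    # normal approximation to the upper 5 % point
print([round(k, 7) for k in kappa], round(lower, 3), round(upper, 3))
```

To the accuracy quoted, the first four cumulants and the two normal-approximation points (-0.299 and 0.273) are recovered; the sketch does not attempt the more refined approximation attached to the $z$ tables.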

Explicit algebraic expressions are readily written down for the cumulants for small $\nu_1$ and $\nu_2$, using the same method as for the logarithmic $\chi^2$ distribution. Where it is necessary to do so, $\nu_1$ will be assumed less than $\nu_2$. In the contrary case we need only interchange $\nu_1$ and $\nu_2$, changing the sign of the odd cumulants in so doing. The odd cumulants are zero when $\nu_1 = \nu_2$. We have

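Even cumulants ($r = 2s$, $s > 0$)

$\nu_1 = 2p$, $\nu_2 = 2q$

$$\kappa_r = 2(r-1)!\left[2^{-r}\zeta(r) - \left(\frac1{2^r} + \frac1{4^r} + \cdots + \frac1{(\nu_1-2)^r}\right) - \frac12\left(\frac1{\nu_1^r} + \frac1{(\nu_1+2)^r} + \cdots + \frac1{(\nu_2-2)^r}\right)\right]. \tag{9}$$

$\nu_1 = 2p + 1$, $\nu_2 = 2q + 1$

$$\kappa_r = 2(r-1)!\left[\zeta(r)(1 - 2^{-r}) - \left(\frac1{1^r} + \frac1{3^r} + \cdots + \frac1{(\nu_1-2)^r}\right) - \frac12\left(\frac1{\nu_1^r} + \frac1{(\nu_1+2)^r} + \cdots + \frac1{(\nu_2-2)^r}\right)\right]. \tag{10}$$

$\nu_1 = \nu_2$. Drop out the last bracket of terms in the above two cases.

$\nu_1 = 2p$, $\nu_2 = 2q + 1$

$$\kappa_r = (r-1)!\left[\zeta(r) - \left(\frac1{2^r} + \frac1{4^r} + \cdots + \frac1{(\nu_1-2)^r}\right) - \left(1 + \frac1{3^r} + \frac1{5^r} + \cdots + \frac1{(\nu_2-2)^r}\right)\right]. \tag{11}$$

$\nu_1 = 2p + 1$, $\nu_2 = 2q$. Interchange $\nu_1$ and $\nu_2$ in this last case.

Odd cumulants ($r = 2s + 1$, $s > 0$)

$\nu_1 = 2p$, $\nu_2 = 2q$, or $\nu_1 = 2p + 1$, $\nu_2 = 2q + 1$

$$\kappa_r = -(r-1)!\left[\frac1{\nu_1^r} + \frac1{(\nu_1+2)^r} + \cdots + \frac1{(\nu_2-2)^r}\right]. \tag{12}$$

$\nu_1 = 2p$, $\nu_2 = 2q + 1$

$$\kappa_r = (r-1)!\left[\zeta(r)(1 - 2^{1-r}) + \left(\frac1{2^r} + \frac1{4^r} + \cdots + \frac1{(\nu_1-2)^r}\right) - \left(1 + \frac1{3^r} + \frac1{5^r} + \cdots + \frac1{(\nu_2-2)^r}\right)\right]. \tag{13}$$

$\nu_1 = 2p + 1$, $\nu_2 = 2q$. Interchange $\nu_1$ and $\nu_2$ in this last case and change the sign of $\kappa_r$.

In the special case of the first cumulant, we have

$\nu_1 = 2p$, $\nu_2 = 2q$, or $\nu_1 = 2p + 1$, $\nu_2 = 2q + 1$

$$\kappa_1 = \tfrac12\log\left(\frac{\nu_2}{\nu_1}\right) - \left(\frac1{\nu_1} + \frac1{\nu_1+2} + \cdots + \frac1{\nu_2-2}\right). \tag{14}$$

$\nu_1 = 2p$, $\nu_2 = 2q + 1$

$$\kappa_1 = \tfrac12\log\left(\frac{\nu_2}{\nu_1}\right) + \log 2 + \left(\frac12 + \frac14 + \cdots + \frac1{\nu_1-2}\right) - \left(1 + \frac13 + \frac15 + \cdots + \frac1{\nu_2-2}\right). \tag{15}$$

$\nu_1 = 2p + 1$, $\nu_2 = 2q$. Interchange $\nu_1$ and $\nu_2$ in this last case and change the sign of $\kappa_1$.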

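For large $\nu_1$ and $\nu_2$, a combination of the asymptotic formulae already given readily yields the following results:

$$\kappa_1 \sim -\frac12\left(\frac1{\nu_1} - \frac1{\nu_2}\right) - \sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j}{j}\left(\frac1{\nu_1^{2j}} - \frac1{\nu_2^{2j}}\right),$$

the numerical coefficients being as for the $\kappa_1$ of $\tfrac12\log(\chi^2/n)$,

$$\kappa_s \sim \frac{(s-2)!}{2}\left(\frac1{\nu_2^{s-1}} + \frac{(-1)^s}{\nu_1^{s-1}}\right) + \frac{(s-1)!}{2}\left(\frac1{\nu_2^{s}} + \frac{(-1)^s}{\nu_1^{s}}\right) + 2\sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j\,(2j+s-2)!}{(2j)!}\left(\frac1{\nu_2^{2j+s-1}} + \frac{(-1)^s}{\nu_1^{2j+s-1}}\right). \tag{16}$$

We may put $s = 1$ in $\kappa_s$ provided we drop the first term. We note also that $\kappa_2$ and higher cumulants can be written down immediately by differentiating the terms in $\nu_2$ and $\nu_1$ of $\kappa_1 - \tfrac12\log(\nu_2/\nu_1)$ successively with respect to $-\nu_2$ and $\nu_1$ respectively.

These are the results given by Cornish & Fisher (1937), whose formulae can be extended at sight by means of the results of this paper. A first approximation not only gives the familiar results

$$\kappa_1 \sim -\frac12\left(\frac1{\nu_1} - \frac1{\nu_2}\right), \qquad \kappa_2 \sim \frac12\left(\frac1{\nu_1} + \frac1{\nu_2}\right),$$

but also the more general

$$\kappa_s \sim \frac{(s-2)!}{2}\left(\frac1{\nu_2^{s-1}} + \frac{(-1)^s}{\nu_1^{s-1}}\right),$$

but it should be noted that for all $s > 1$ a second approximation, which takes in an additional term, is

$$\kappa_s \sim \frac{(s-2)!}{2}\left\{\frac1{(\nu_2-1)^{s-1}} + \frac{(-1)^s}{(\nu_1-1)^{s-1}}\right\}.$$

The accuracy of the asymptotic approximation at the limits of the $z$ table given by Fisher & Yates (1943) can be seen by applying it to our example ($\nu_1 = 24$, $\nu_2 = 60$). The numbers of terms which are significant in the eighth place (needed for final accuracy to 7 decimal places), are three for $\kappa_1$, four for $\kappa_2$ and $\kappa_3$, and three for $\kappa_4$, $\kappa_5$ and $\kappa_6$. The first term for $\kappa_6$, namely $12(\nu_2^{-5} + \nu_1^{-5})$, yields 0.0000 015, rather more than 20 % too low. To use

$$12\{(\nu_2 - 1)^{-5} + (\nu_1 - 1)^{-5}\}$$

would give 0.0000 019, about 2 % too high.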
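The two leading-term approximations to $\kappa_6$ quoted above are plain arithmetic and can be reproduced in a few lines. The variable names below are hypothetical; only the two expressions $12(\nu_2^{-5} + \nu_1^{-5})$ and $12\{(\nu_2-1)^{-5} + (\nu_1-1)^{-5}\}$ come from the text.

```python
nu1, nu2 = 24, 60

# First approximation: leading term of the asymptotic series for kappa_6.
first = 12 * (nu2 ** -5 + nu1 ** -5)

# Second approximation: the same term with nu replaced by nu - 1.
second = 12 * ((nu2 - 1) ** -5 + (nu1 - 1) ** -5)

print(f"{first:.7f}  {second:.7f}")
```

Rounded to seven places these give 0.0000015 and 0.0000019, the figures set against the tabulated $\kappa_6$.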

Should $\nu_1$ or $\nu_2$ be only moderate in size, the other being large, we may make use of the relation

$$\zeta(s, a) = \frac1{a^s} + \frac1{(a+1)^s} + \cdots + \frac1{(a+r-1)^s} + \zeta(s, a + r) \quad (r \text{ an integer}),$$

where $a$ is one-half of the smaller of $\nu_1$ or $\nu_2$, to convert our formulae into forms in which asymptotic expansions may be applied to both of the Zeta-functions. We then have ($\nu_1 < \nu_2$):

^ ['°® fe) ^ ^ ■ ('-. * • ■ ■ ■" vT+fcii) ' 


and 




t— 1)!#® 


» ( -l)^-^^^(2j + A-2)! 

(2j)!M2^-i " • 


(17) 


1=1 


Particular cases of some interest arise (i) when r — — and being either both odd 

or both even, and (ii) when r = ^(1^3 - -f 1), I'l (or Vg) being even and Vg (or Vj) odd. In the 
former case the first term within squared brackets in x^ is 2-®(l -|- ( — 1)®) ^(s, ^v^), which is 
zero when s is odd and 2i-«g(s, ^Vg) when s is even. In the latter we have 

which is ^{s, Vg) when s is even. With s odd we are concerned with the difference of two Zeta- 
functions in which the a’s differ by one-half, and the expression may be written 

^ (-1)’ 


and 


3^niv2+jY (s-l)!^^ 

“ ( — 1 )’ p dx 
j^n^’2+j Jo l-f'.'T 


■-1 “ (-1)^ 


on integration by parts 


,.?n2’+i(vg-fj)! 

_ 1 .^ V (-iy~M2">-l)^j 

2vg'^j“i 2jvl^ 

on expansion in powers of This asymptotic expansion is an interesting one in which the 

early coefficients are very simple, for the series is 

1 1 1 1 17 

^ J L 




The various cases are set out below: 

Even cumulants (r = 25, s > 0) 

)'i = 2'p, I'a = 2g, or = 2/^+1, Va = 2^+ 1 

=(»'-’ ) ’{■^+■(^7:^ + • • • + (V, _ 2y) 


+ 


(r - 2) ! . (r-1)! 1 ” ( ~ 4)>' g,-(2j + r~ 2) ! 




(2j)!pr 


-1 


( 19 ) 


Vy = 2}),Vz=lq^-\ 

^ = (r- !)!{-, + ... 


(r-2)! (r-l)l 1 ^ ( - i)l^,(2j + r- 2) ! 
+ pr-i + 24 (2j)!vr-i • 


Odd cumulants (r = 2s + 1, s > 0) 

Vj = 2p, 1^2 = 2g, or Vi = 2p+ I, :^2 = 2g+ 1 

Kr = -(r-1)! 

i/j = 2p, 1^2 = ^+ 1 


11 ,1 

+ ,: ■ '^+ ••• +■ 


( 21 ) 


t-r (»'i + 2)^ ir,-2rj’ 

(>'i+2)-- (P2-i)i 

, (r-l)! , i S (-l)^-M2^>-l)P,(2j + r-2)! 

2Pi mvv-^ ' ^ ’ 


Kf = -(r- l)!|-^ + + 


In the special case of s = 1, we have 

Pj = 2p, P2 = 2q, or p^ = 2jp+ 1, P2 = 2q+l 

^■, = Uog(— )-(— + - +...4- — 

“ W Vi ' T +2 4-2/ 

''1 = 2??, = 2(/ + 1 

' - ^W-Vl ^1 + 2 ^ ^V2-1/^2i^ 2 ,-=i 2jvlt 


(23) 


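3. The logarithmic $t$-distribution

When $\nu_1 = 1$, $z = \log|t|$, and we thus have as a special case for the distribution of $\log|t|$ for $\nu_2 = n$ degrees of freedom:

$$2\kappa_1 = \log n + \operatorname{Lt}_{s\to1}\{\zeta(s, \tfrac12 n) - \zeta(s, \tfrac12)\} \tag{24}$$

$$= \log n + \psi(\tfrac12) - \psi(\tfrac12 n) \quad (\psi(\tfrac12) = -\gamma - 2\log 2). \tag{25}$$

For small $n$

$$n = 2p: \quad \kappa_1 = \tfrac12\log n - \log 2 - \left(\frac12 + \frac14 + \cdots + \frac1{2p-2}\right),$$

$$n = 2p + 1: \quad \kappa_1 = \tfrac12\log n - \left(1 + \frac13 + \frac15 + \cdots + \frac1{2p-1}\right). \tag{26}$$

For large $n$

$$\kappa_1 \sim -\tfrac12(\gamma + \log 2) + \frac1{2n} + \sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j}{j\,n^{2j}}. \tag{27}$$

Also

$$\kappa_s = 2^{-s}(s-1)!\,\{\zeta(s, \tfrac12 n) + (-1)^s\zeta(s, \tfrac12)\} \quad (s > 1), \tag{28}$$

$$2^s\kappa_s = \psi^{(s-1)}(\tfrac12) + (-1)^s\psi^{(s-1)}(\tfrac12 n) \quad \bigl(\psi^{(s-1)}(\tfrac12) = (-1)^s (s-1)!\,(2^s - 1)\,\zeta(s)\bigr). \tag{29}$$

For small $n$ we have the following cases:

Even cumulants ($r = 2s$, $s > 0$)

$$n = 2p: \quad \kappa_r = (r-1)!\left\{\zeta(r) - \left(\frac1{2^r} + \frac1{4^r} + \cdots + \frac1{(2p-2)^r}\right)\right\},$$

$$n = 2p + 1: \quad \kappa_r = (r-1)!\left\{2\zeta(r)(1 - 2^{-r}) - \left(1 + \frac1{3^r} + \frac1{5^r} + \cdots + \frac1{(2p-1)^r}\right)\right\}. \tag{30}$$

Odd cumulants ($r = 2s + 1$, $s > 0$)

$$n = 2p: \quad \kappa_r = -(r-1)!\left\{\zeta(r)(1 - 2^{1-r}) + \frac1{2^r} + \frac1{4^r} + \cdots + \frac1{(2p-2)^r}\right\},$$

$$n = 2p + 1: \quad \kappa_r = -(r-1)!\left\{1 + \frac1{3^r} + \frac1{5^r} + \cdots + \frac1{(2p-1)^r}\right\}. \tag{31}$$

For large $n$

$$\kappa_s \sim (-1)^s (s-1)!\,\zeta(s)(1 - 2^{-s}) + \frac{(s-2)!}{2n^{s-1}} + \frac{(s-1)!}{2n^s} + 2\sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j\,(2j+s-2)!}{(2j)!\;n^{2j+s-1}} \quad (s > 1). \tag{32}$$

In the special case of $n = \infty$ we have for the distribution of $\log|x|$, where $x$ is a normal variable with zero mean and unit standard deviation:

$$\kappa_1 = -\tfrac12(\gamma + \log 2), \qquad \kappa_s = (-1)^s (s-1)!\,\zeta(s)(1 - 2^{-s}), \tag{33}$$

as follows also from the case of $\tfrac12\log(\chi^2/n)$ on putting $n = 1$.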


4. Note on the $\chi$ distribution approximation

Fisher's result that $\sqrt{2\chi^2}$ is approximately normally distributed about a mean of $\sqrt{2n - 1}$ with unit variance ($n$ being the number of degrees of freedom) is well known. The demonstration depends on showing that the mean value of $\chi$ is

$$\kappa_1 = \sqrt2\,\Gamma\{\tfrac12(n+1)\}/\Gamma(\tfrac12 n) \sim \sqrt{n - \tfrac12} \quad\text{for large } n$$

and that the variance is $n - \kappa_1^2 \sim \tfrac12$,

but to this order of approximation it is not possible to show that 71 and tend to zero with 
increasing n. A formula for the ratio of the two Gamma functions, developed as far as terms 
in (see Wishart (1926)), gives yi~(2w)-* and = 0(72,-^) (see, for example, Kendall 
(1946)), but owing to the vanishing of the term in of 72 its leading term has so far not been 
accurately obtained, although the exact (but somewhat complicated) expressions for the 
/?! and /?2 of the distribution of 5 = o‘x/V(»+ 1 ) given in an editorial in Biometrika {191&), 
10 , 622. 

1 00 /_ 17-1 (2®' — 11J9' 

Since H^i(» + 1) - filn)) = ^ + ■' ’ 

by the formula given in § 2, we find on integration, and insertion of the appropriate constant, 
that CO / nj(221- 1)5. 

log 1 ) - log r{\n) = \ log (4»i) + 2j(2j-l)«,2/-i’ ’ 


and thus have 


ri{n+l) 

r{kn) 


^{\n) exp - 


4w 



,•=2 ' I 


(34) 



which can readily be expanded to give the additional terms necessary to enable the cumu- 
lants of y (or of^(2x^)) to be worked out (see Johnson & Welch (1939)). Taking ^ih{n~-\)} 
as the first approximation we find 


ri(tr+l) 

mn) 


n 


_ J- 


exp 




1 1 

■i 1 W' 2 ■ 

n 


An? 


+ 


■■■)) 



_ 

16w.(9j,— 1) 



-3.5)^ 


(36) 


thus providing a second approximation 
one-half. The cumulants of ^-re 

/Cj = V(2n-1)^ 


to the ratio of two Gamma functions differing by 


i-_J_ _1- 

4% 


0{n-\ 


The Editorial in Biomdrika (1916), 10 , 523 calls attention in a footnote to ‘Student’s’ 
approximations for the and of the sample .standard deviation. The above formulae 
show that ‘ Student’s ’ results should be 

in which n is now the size of the sample. For w = 10 these give value.s too low by 2 and 5 
respectively in the fourth place of decimals. Practically four-figure accuracy can be attained 
with n as low as 10 if in the terms in n~^ we replace 31/8 by 1 7/4 and 7 /8 by 1 . 


REFERENCES

Bartlett, M. S. & Kendall, D. G. (1946). J.R. Statist. Soc. Suppl. 8, 128.
British Association (1931). Mathematical Tables, 1, 42. London: Cambridge University Press.
Cornish, E. A. & Fisher, R. A. (1937). Rev. Inst. Internat. Statist. 4, 1.
Davis, H. T. (1933, 1935). Tables of the Higher Mathematical Functions, 1, 1. Indiana: Principia Press.
Fisher, R. A. & Yates, F. (1943). Statistical Tables, 2nd ed. London: Oliver and Boyd.
Johnson, N. L. & Welch, B. L. (1939). Biometrika, 31, 216.
Kendall, M. G. (1945). Advanced Theory of Statistics, 1, 2nd ed., § 12.7. London: Griffin and Co.
Pairman, E. (1919). Tracts for Computers, no. 1. London: Cambridge University Press.
Thompson, C. M. (1941). Biometrika, 32, 168.
Wishart, J. (1926). Biometrika, 17, 68.





THE MEANING OF A SIGNIFICANCE LEVEL 


By G. A. BARNARD


A level of significance is a probability. To say that a given result is significant on the 5 % level means that some class of events has probability 0.05. Now whatever theory we may hold as to the nature of probability, in order to give a statement of probability a precise meaning we must refer to some reference class, or set of data, on which the probability is calculated. What is the reference class involved in a level of significance?

To many people the answer to this question seems simple enough. The reference class 
involved is the set of indefinite (possibly imaginary) repetitions of the experiment which 
gave the result in question. Otherwise put, the data, on which the probability is calculated, 
are the external conditions of the experiment. The following example indicates, however, 
that the meaning of this reference class is not always clear. The example is a modified form 
of one given by Prof. R. A. Fisher in a letter to the author. 

Suppose we have a bag of chrysanthemum seeds, known to give plants having white 
flowers or plants having purple flowers, no other colours being possible. We suspect that 
the proportions of white and purple seeds are equal, and to test this hypothesis we select 
at random ten seeds from the bag, and plant them. Nine of the plants grow to maturity, 
and all of them have white flowers. On what level of significance can we reject the hypothesis 
of equality of proportions ? We may assume that white and purple plants are equally viable. 

It would be natural to argue that, if white and purple flowers were equally likely, the probability of our result would be $1/2^9$. If there is no reason to suspect an excess of white rather than an excess of purple flowers, we must add to this the probability of getting nine purple flowers, which is also $1/2^9$, giving a total probability of $1/2^8$. The hypothesis of equality of proportions would then be rejected on the 1/256, or the 0.3906 % level of significance. But if we did this our reference class would not be the set of indefinite repetitions of the experiment, in its ordinary meaning.
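The arithmetic of this first calculation is short enough to set out exactly. The fragment below is only an illustration, with hypothetical variable names; it treats each of the nine plants as white or purple with probability ½ and doubles the chance of nine whites to allow for nine purples.

```python
from fractions import Fraction

# Chance that all nine plants are white when white and purple are equally likely.
p_nine_white = Fraction(1, 2) ** 9

# Nine purple is counted as equally extreme, so the two chances are added.
level = 2 * p_nine_white

print(level, f"{float(level) * 100:.4f} %")
```

The exact level is 1/256, that is 0.390625 %, which rounds to the 0.3906 % of the text.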

A repetition of the experiment, in its ordinary meaning, would consist of another selection 
of ten seeds from the bag, and their planting and growth. On such another occasion all ten 
plants might grow to maturity, or all or some might die. These possibilities have not been 
taken into account in our calculation of probability, so far. 

To allow for the possible variation in the number of plants which grow, we might lay out the set of all possible results of the experiment as in Fig. 1, where $n$ denotes the number of plants that grow, and $r$ denotes the excess of white over purple. Thus any point in the figure can be referred to uniquely by its co-ordinates $(n, r)$. If we now introduce a parameter $p$, to denote the probability (if it exists) that a plant will grow to maturity, given that it has been selected, the probability associated with the point $(n, r)$ on the hypothesis of equality of proportions of white and purple will be

$$W(n, r \mid p) = \frac{10!}{n!\,(10-n)!}\,p^n(1-p)^{10-n}\cdot\frac{n!}{\{\tfrac12(n+r)\}!\,\{\tfrac12(n-r)\}!}\,2^{-n},$$

and since this is a function of the unknown $p$, we have a special problem of arranging the points $(n, r)$ in order of significance before we can establish a test. The situation in this respect is similar to that dealt with in the paper on 2 x 2 tables, printed earlier in this issue (Barnard, 1946, pp. 123-38 above).



Proceeding as in the earlier paper, we notice first that the same level of significance must apply to $(n, r)$ as to $(n, -r)$, so that we can confine our further considerations to the upper half of the diagram. Now in this half, the transition from $(n, r)$ to $(n+1, r+1)$ means we discover that one of the plants which failed to grow in our case was in fact a white-flowered plant. In this case our conviction that there is an excess of white-flowered plants would be strengthened, so that $(n+1, r+1)$ would be reckoned more significant than $(n, r)$. Similarly, going from $(n, r)$ to $(n+1, r-1)$ would mean that a missing plant was found to be purple, and this would weaken our belief in an excess of white-flowered plants; consequently, $(n, r)$ would be reckoned more significant than $(n+1, r-1)$. Finally, going from $(n, r)$ to $(n+2, r)$ would mean growing two more plants, one purple and one white, and this would increase our tendency to believe in the equality of proportions. Consequently, $(n, r)$ would be reckoned more significant than $(n+2, r)$. These principles taken together imply that points lying north-east, or west, of a given point $(n, r)$, or between these two directions, would be reckoned more significant than $(n, r)$; while, conversely, points lying east to south-west (inclusive) from $(n, r)$ would be reckoned less significant than $(n, r)$. The relative significance of points lying inside the half-quadrants north-east to east and south-west to west would remain undetermined.

[Fig. 1. Lattice of the possible results $(n, r)$: $n$, the number of plants that grow, from 0 to 10; $r$, the excess of white over purple, from -10 to 10.]

We could now proceed as in the paper (1), building up a test, consistent with the above partial ordering, in such a way as to make the significance or otherwise of our result depend as little as possible on any knowledge we may have about the value of $p$. But we need not carry this through for the result we have quoted, since our conditions by themselves require that the only points in the diagram which should be reckoned not less significant than our result are the points (9, 9), (9, -9), (10, 10) and (10, -10). The probability associated with these four points is

$$P(9, 9; p) = 2\{10p^9(1-p)\cdot2^{-9} + p^{10}\cdot2^{-10}\} = (p/2)^9(20 - 19p),$$

the maximum value of which occurs when $p = 18/19$, and is $P_m(9, 9) = 0.002413$. Thus on this basis we should conclude that our result was significant on the 0.2413 % level.
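The maximisation over $p$ can be retraced with a short sketch. It is offered as an illustration under stated assumptions: the function `W` below follows the binomial form of $W(n, r \mid p)$ for ten seeds (growth with chance $p$, colour with chance ½), `P` adds the four points named above, and the grid search over $p$ is a hypothetical device standing in for the calculus.

```python
import math

def W(n, r, p, planted=10):
    """Chance of n plants growing with excess r of white over purple."""
    if (n + r) % 2 or abs(r) > n:
        return 0.0                      # r must have the parity of n and |r| <= n
    grow = math.comb(planted, n) * p ** n * (1 - p) ** (planted - n)
    colour = math.comb(n, (n + r) // 2) * 0.5 ** n
    return grow * colour

def P(p):
    """Total chance of the four points reckoned not less significant than (9, 9)."""
    return sum(W(n, r, p) for n, r in [(9, 9), (9, -9), (10, 10), (10, -10)])

# Crude grid search for the value of p that maximises P.
grid = [i / 10000 for i in range(10001)]
p_max = max(grid, key=P)
level = P(18 / 19)

print(p_max, level)
```

The search settles close to $p = 18/19$, where the closed form $(p/2)^9(20 - 19p)$ reduces to $2(9/19)^9$, or about 0.0024; this is of the same order as the level of roughly 0.24 % quoted above and smaller than the 0.39 % of the first calculation.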



The difference between the first result, 0.3906 %, and the second, 0.2413 %, is in practice negligible. Somewhat larger differences will be found in other similar cases, however, and it seems worth while to try to clarify the cause of the discrepancy.

Consider three possible causes for the failure of the tenth plant to grow to maturity:

(1) The bag from which the seed was taken is known to contain a proportion of dead seeds, which are physically indistinguishable from the live ones, and the tenth seed planted happened to be one of these. The conditions of growth were such that any live seed planted would have grown.

(2) The tenth plant happened to be attacked by a soil pest, which destroyed it.

(3) The statistician trod on the tenth plant while running for a bus; otherwise, it would have grown.

If we now consider what would happen in these three cases if the experiment were repeated, in case (1) we should be just as uncertain as before how many plants would grow, out of those selected. In case (2), we might or might not happen to strike a good year for the pest in question, so that we might or might not have a similar accident recurring. In case (3) we should obviously give the statistician firm instructions not to be careless, and then we could be reasonably certain that all the plants selected would grow.*

In the first case, we can suppose that the proportions of white, purple, and dead seeds in the bag are, respectively, $p_1$, $p_2$, and $1 - (p_1 + p_2)$, and the purpose of our experiment is to test the hypothesis $p_1 = p_2$. In this case, putting $p_1 + p_2 = p$, we can clearly apply the analysis of Fig. 1, and the appropriate level of significance is 0.2413 %.

In the third case, the situation actually realized is just what it would have been if we had warned the statistician beforehand, and then thrown one of the ten seeds back into the bag. Thus our effective sample size here is 9, and the appropriate level of significance is 0.3906 %.

In the second case, the answer depends on our attitude to the set of accidents of which 
the pest is a specimen. If this set of accidents is regarded as a stable set of chance causes 
we may be justified in representing its effect on the growth of our plants by the prob- 
ability p. If, on the other hand, the incidence of such pests undergoes, say, regular cyclical 
fluctuations from year to year, so that its incidence is to some extent predictable, if not 
wholly controllable, then we should not be justified in assuming the existence of a real 
probability corresponding to our parameter p. We should, to be on the safe side, in this case 
allow for the possibility that experimental technique might improve in the future, to such 
an extent as to eliminate the possibility of such accidents. Thus, adopting this conservative 
attitude to our results, we should here treat the effective sample size as 9. The repetitions
of the experiment which we have in mind would then be imaginary repetitions, in which 
experimental technique was supposed to be better than it is now, and we have as much 
control over pests as we have over statisticians. 

The general situation illustrated by this example can be described in terms of the notion 
of ‘isolate’ introduced by Prof. H. Levy (1931). In making an experiment, we try to 
construct an isolate — a system, or part of the world, which we suppose has relatively little 
interaction with the rest of the world, and which, for practical purposes, may be considered 
on its own. This isolate may contain within itself all the systems of chance causes which are 

* It is not suggested that the three cases exhaust the multiplicity of types which might arise in 
practice. As Prof. Pearson has pointed out, if it were not the statistician, but his three-year-old son 
who was the vandal in case (3), we should have here a situation intermediate between our second and 
third instances. 



182 Meaning of a significance level

regarded as affecting, to any practical extent, the results of the experiment. Such is the case 
in (1), where all the chance causes involved in the experiment are supposed given in 
the bag which is the subject of the experiment. Here, then, we are dealing with a ‘good 
isolate’, whose interaction with the rest of the world is really negligible, and chance causes 
operate within the isolate. 

In case (3), on the other hand, we are dealing with an imperfect isolate. The outside world,
in the shape of the statistician, interacts with our isolate to an extent not negligible in 
practice. Fortunately, in this case we are able to construct a smaller isolate, consisting of 
the nine surviving plants, in which the interactions with the outside world are negligible. 
In case (2), there may be some doubt as to what isolate we are discussing. If we regard soil 
pests and such things as included in the isolate, and represent them as a stable set of chance 
causes, then we are entitled to analyse as in case (1); but if the pests are not included in the
isolate, we should analyse as in case (3). 

Statistical tests are applicable to at least two types of experiment. First, to experiments 
in which the isolate studied contains within itself a system of chance causes which may 
influence the results. And second, to experiments in which the isolate studied is not a ‘ good’ 
isolate, and the residual interactions with the rest of the world may affect the results. There 
may also be mixed cases.

The distinction between the two types may also be brought out in relation to the necessity
or otherwise of an ‘artificial’ randomization procedure, using random digits or the like.
In the first type, such an artificial randomization procedure is not strictly necessary; for
example, with our bag of seeds, the bag itself, and its physically indistinguishable contents,
forms a perfectly adequate randomizer. We have in this case, as it were, an impermeable
shield around the system, which prevents any external shocks from affecting the system.
In the second type of experiment, we need to ensure that the interactions with the outside
world will not mask the results we are interested in; and if we cannot ensure a practically
complete separation from the outside world, then the effect of external interactions must
be randomized, by a special procedure. The randomization here acts like a shock absorber,
specially placed around the experiment to distribute external shocks evenly through the
system.

In the first type of experiment, the reference class to which the significance level applies 
is in fact the set of indefinite repetitions of the experiment in question. In the second type 
of experiment, the reference class is an ideal set, in which the accidental influences of the 
outside world repeat themselves exactly, while the effect of these accidents on the system 
varies as a result of the special randomization. 


REFERENCES 

Barnard, G. A. (1946). Significance tests for 2 × 2 tables. Biometrika, 34, 123.
Levy, H. (1931). The Universe of Science. London: Watts and Co.





Vol. XXXIV. Parts III and IV December, 1947



Reprinted by offset-litho 1963


[Issued 30 December, 1947]




Volume XXXIV, Parts III and IV 


December 1947 


ON THE DISTRIBUTION OF THE RANK CORRELATION
COEFFICIENT τ WHEN THE VARIATES ARE
NOT INDEPENDENT

By WASSILY HÖFFDING


I. Introduction

1. Consider a population distributed according to two variates $x$, $y$. Two members
$(x_1, y_1)$ and $(x_2, y_2)$ of the population will be called concordant if both values of one member
are greater than the corresponding values of the other one, that is if

$$x_1 < x_2,\ y_1 < y_2 \quad\text{or}\quad x_1 > x_2,\ y_1 > y_2.$$

They will be called discordant if for one member one value is greater and the other one smaller
than for the other member, that is if

$$x_1 < x_2,\ y_1 > y_2 \quad\text{or}\quad x_1 > x_2,\ y_1 < y_2.$$


The probability p that two members drawn from the population at random without 
replacement are concordant will be called the probability of concordance, the probability 
q that they are discordant will be called the probability of discordance. 

In the following only populations will be considered for which the probabilities of $x_1 = x_2$
or $y_1 = y_2$ are zero, so that

$$p + q = 1. \tag{1}$$

The main types of such populations are (a) an infinite population with both $x$ and $y$
distributed continuously, (b) a finite population where all values of $x$ and all values of $y$ are
different among themselves. The condition that the two members are drawn without replacement
is, of course, only relevant in case (b).

For a sample of $n$ members drawn from the population, the probabilities of concordance
and discordance are defined in the same manner as for the population. They will be denoted
by $p'$ and $q'$ to distinguish them from the population values. If for the population (1) is
fulfilled, it may be assumed that all values of $x$ and all values of $y$ in the sample are different,
so that

$$p' + q' = 1. \tag{2}$$


It follows from the definition that p' is the relative frequency of concordant pairs among 
the pairs which can be formed from the members of the sample. 


The probability of concordance expresses an essential property of a bivariate distribution.
It may in itself be considered as a measure of correlation. $p'$ is an estimate of $p$; it will be
shown that the mean value of $p'$ is $p$. If a coefficient lying between the limits $-1$ and $+1$ is
preferred, the quantity

$$\tau = p' - q' = 2p' - 1 \tag{3}$$

may be taken.
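The definitions of $p'$, $q'$ and $\tau$ can be illustrated by a direct count over all pairs of a small sample. This is a minimal sketch which assumes, as in the text, that no two $x$ values and no two $y$ values coincide; the function name and the data are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def concordance(xs, ys):
    """Return (p', q', tau) for paired observations without ties."""
    pairs = list(combinations(range(len(xs)), 2))
    # A pair is concordant when the x- and y-differences have the same sign.
    concordant = sum(
        1 for i, j in pairs if (xs[i] - xs[j]) * (ys[i] - ys[j]) > 0
    )
    p = Fraction(concordant, len(pairs))
    q = 1 - p                      # valid only in the absence of ties, by (2)
    return p, q, p - q


p, q, tau = concordance([1, 2, 3, 4], [2, 1, 4, 3])
print(p, q, tau)  # 2/3 1/3 1/3
```

Four of the six pairs are concordant here, so that $p' = \tfrac{2}{3}$ and $\tau = 2p' - 1 = \tfrac{1}{3}$.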


2. The quantity $p$ here termed the probability of concordance was apparently first
considered by Esscher (1924) who also used the quantity

$$D = \frac{2}{n(n-1)} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sign}(x_i - x_j)\,\operatorname{sign}(y_i - y_j)$$

(where $x_i$, $y_i$, $i = 1, \ldots, n$, are the sample values of the variates) which is the same as the
coefficient $\tau$ as defined by (3). Esscher showed that if $x$ and $y$ are normally correlated with
correlation coefficient $r$, the expectation of $D = \tau$ is

$$E(\tau) = \frac{2}{\pi} \sin^{-1} r. \tag{4}$$

Hence, from this equation, he suggested estimating $r$ from ranked data by means of the relation

$$r = \sin \tfrac{1}{2}\pi\tau = \sin \pi(p' - \tfrac{1}{2}).$$
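Esscher's relation between $r$ and $\tau$ in the normal case is easily put into computational form. The sketch below merely transcribes (4) and its inverse; the function names are illustrative, and the relation is exact only for a normal population.

```python
import math


def expected_tau(r: float) -> float:
    """E(tau) = (2/pi) arcsin r for a normal population, equation (4)."""
    return 2.0 / math.pi * math.asin(r)


def r_from_tau(tau: float) -> float:
    """Esscher's estimate r = sin(pi * tau / 2) from a rank coefficient tau."""
    return math.sin(0.5 * math.pi * tau)


print(expected_tau(0.5))   # close to 1/3, since arcsin 0.5 = pi/6
print(r_from_tau(1 / 3))   # close to 0.5
```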


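For the variance of $\tau$ Esscher found in the case of a normally distributed population

$$\operatorname{var}(\tau) = \frac{2}{n(n-1)} \left[ 1 - \frac{4}{\pi^2} (\sin^{-1} r)^2 + 2(n-2) \left\{ \frac{1}{9} - \frac{4}{\pi^2} \left( \sin^{-1} \frac{r}{2} \right)^2 \right\} \right]. \tag{5}$$
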


While Esscher saw in $p'$ and $D = \tau$ only a means for estimating $r$, Lindeberg (1926) stressed
the significance of the probability of concordance itself for judging the degree of dependence
between the variates. He proposed for that purpose the coefficient

$$P = 100p' - 50 = 50\tau,$$

called by him Korrelationsprozent. Lindeberg also gave, without proof, a formula for the
variance of $p'$ in the general case of correlated variates (see (13) below).

Jordan (1927) suggested using, instead of Lindeberg’s $P$, the coefficient later termed by
Kendall $\tau$.

Kendall (1938), independently of the above authors, proposed $\tau$ as a measure of rank
correlation. He completely solved the problem of the sampling distribution of $\tau$ in a universe
in which all possible rankings are equally probable, showing that it rapidly tends to
normality for increasing $n$.


3. The main object of this paper is to show that the sampling distribution of $p'$ (and hence
that of $\tau$) tends to normality as $n \to \infty$ for any population with continuously distributed $x$
and $y$ if a certain condition is fulfilled (Part IV). In addition, Lindeberg’s formula for
the variance of $p'$ is proved (Part II) and extended for a finite population (Part V). Finally,
in Part VI the problem of estimating var $(p')$ from the sample is considered.


II. Mean value and variance of p' in the case of an infinite population

4. Consider a sample of $n$ drawn at random from an infinite population with continuous
$x$ and $y$. Replace the values of $x$ and of $y$ in the sample by their ranks and arrange the
members of the sample so that the ranking of $x$ is $1, 2, \ldots, n$. Then the ranking of $y$ is a
permutation

$$\Pi = (\pi_1, \ldots, \pi_n)$$

of the numbers $1, \ldots, n$.

Let $I$ and $J$ be the numbers of inversions in the permutations $(\pi_n, \pi_{n-1}, \ldots, \pi_1)$ and
$(\pi_1, \ldots, \pi_n)$. Then

$$p' = \frac{2I}{n(n-1)}, \quad q' = \frac{2J}{n(n-1)}. \tag{6}$$

Thus the knowledge of the permutation $\Pi$ corresponding to the given sample is sufficient
for evaluating $p'$.
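The evaluation of $p'$ and $q'$ from inversion counts may be sketched as follows. It is assumed here that $I$ counts the inversions of the reversed permutation (the concordant pairs) and $J$ those of the permutation itself (the discordant pairs); the quadratic counting routine is chosen for transparency, not speed, and the names are illustrative.

```python
from fractions import Fraction


def inversions(seq):
    """Count pairs i < j with seq[i] > seq[j] (quadratic, adequate for small n)."""
    n = len(seq)
    return sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] > seq[j])


def p_q_from_permutation(perm):
    """Return (p', q') from the y-ranks listed in increasing order of x."""
    n = len(perm)
    total = n * (n - 1)
    big_i = inversions(perm[::-1])   # I: inversions of (pi_n, ..., pi_1)
    big_j = inversions(perm)         # J: inversions of (pi_1, ..., pi_n)
    return Fraction(2 * big_i, total), Fraction(2 * big_j, total)


print(p_q_from_permutation([2, 1, 4, 3]))  # (Fraction(2, 3), Fraction(1, 3))
```

For larger samples a merge-sort count of inversions would reduce the labour from order $n^2$ to order $n \log n$.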





5. Let $P(\Pi)$ be the probability of drawing a random sample represented by the
permutation $\Pi$. Let $p'(\Pi)$ be the probability of concordance for such a sample. Then

$$p = \sum P(\Pi)\, p'(\Pi), \tag{7}$$

where the sum is extended over all permutations $\Pi$ of $n$ numbers.

The right-hand side of (7) is equal to the mean value of $p'$. Hence

$$E p' = p. \tag{8}$$

Consider, in generalization of $p'$, the probability $w'$ that among $m \le n$ members drawn
from the sample at random without replacement, certain pairs of members are concordant;
for instance, among four members $A, B, C, D$, the pairs $AB, AC, AD$; or the pairs $AB, CD$,
etc. Let $w$ be the corresponding probability for the parent population. Then it is seen in
the same manner as with $p'$ that

$$E w' = w. \tag{9}$$

Thus, if we can express $(p')^{\mu}$, the probability of drawing $\mu$ concordant pairs from the
sample, replacing each pair after drawing it, by probabilities without replacement of the
type $w'$, we can also, in virtue of (9), represent $E(p')^{\mu}$ by population parameters of the type $w$.

6. Now, $(p')^2$, the probability of drawing from the sample one concordant pair and, after
replacing it, of drawing again a concordant pair, is the sum of the following three probabilities:

(a) the probability of getting the same pair in both drawings, $1 \big/ \binom{n}{2}$, multiplied by
the probability that this pair is concordant $(p')$;

(b) the probability that the second pair has one member in common with the first pair,
$2(n-2) \big/ \binom{n}{2}$, multiplied by the probability, say $k'$, that among three members $A, B, C$
drawn from the sample without replacement, one, say $A$, is concordant with the other two;

(c) the probability that the second pair has no member in common with the first one,
$\binom{n-2}{2} \big/ \binom{n}{2}$, multiplied by the probability that among four members $A, B, C, D$ drawn
without replacement, two pairs without a member in common, say $AB$ and $CD$, are concordant.
The latter probability may be denoted by $(p^2)'$ since the corresponding probability
for the infinite population is $p^2$.

Thus,

$$\binom{n}{2} (p')^2 = p' + 2(n-2)k' + \binom{n-2}{2} (p^2)', \tag{10}$$

and, applying (9),

$$\binom{n}{2} E(p')^2 = p + 2(n-2)k + \binom{n-2}{2} p^2. \tag{11}$$

Hence, we have for the variance of $p'$

$$\binom{n}{2} \operatorname{var}(p') = \binom{n}{2} \{E(p')^2 - p^2\} = p + 2(n-2)k - (2n-3)p^2 \tag{12}$$

or

$$\binom{n}{2} \operatorname{var}(p') = \binom{n}{2} \mu_2(p') = p(1-p) + 2(n-2)(k - p^2). \tag{13}$$

This is identical with the formula given without proof by Lindeberg (1926).
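The decomposition of $(p')^2$ into the three cases (a), (b), (c) is an exact identity for every sample, and may be checked by brute force on a small permutation. In the sketch below $k'$ and $(p^2)'$ are computed directly from their definitions over ordered triples and quadruples; the function names and the particular permutation are illustrative.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb


def conc(perm, i, j):
    """True if members i and j (x in index order, y-ranks perm[i], perm[j]) are concordant."""
    return (i - j) * (perm[i] - perm[j]) > 0


def sample_probabilities(perm):
    """Return (p', k', (p^2)') for the sample represented by perm (needs n >= 4)."""
    idx = range(len(perm))
    pairs = list(combinations(idx, 2))
    p1 = Fraction(sum(conc(perm, i, j) for i, j in pairs), len(pairs))
    triples = list(permutations(idx, 3))      # ordered (A, B, C)
    k1 = Fraction(
        sum(conc(perm, a, b) and conc(perm, a, c) for a, b, c in triples),
        len(triples),
    )
    quads = list(permutations(idx, 4))        # ordered (A, B, C, D)
    p2 = Fraction(
        sum(conc(perm, a, b) and conc(perm, c, d) for a, b, c, d in quads),
        len(quads),
    )
    return p1, k1, p2


perm = (3, 1, 4, 6, 2, 5)
n = len(perm)
p1, k1, p2 = sample_probabilities(perm)
lhs = comb(n, 2) * p1 ** 2
rhs = p1 + 2 * (n - 2) * k1 + comb(n - 2, 2) * p2
print(lhs == rhs)  # True: both sides of equation (10) agree exactly
```

Exact rational arithmetic is used so that the two sides of (10) can be compared without rounding error.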

7. In the case considered by Kendall where all permutations $\Pi$ of $n$ numbers are equally
probable, the permutations of $m < n$ also are equiprobable. Hence

$$p = P(1,2) = q = P(2,1) = \tfrac{1}{2}.$$

Further, representing $k$ as the mean value of $k'$ in a sample of 3, we find

$$k = P(123) + \tfrac{1}{3}P(132) + \tfrac{1}{3}P(213) = (1 + \tfrac{1}{3} + \tfrac{1}{3})\tfrac{1}{6} = \tfrac{5}{18}.$$

Inserting these values in (13), we have

$$\operatorname{var}(p') = \frac{2n+5}{18n(n-1)}, \quad \operatorname{var}(\tau) = 4 \operatorname{var}(p') = \frac{2(2n+5)}{9n(n-1)},$$

in accordance with Kendall’s formula.
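For a small $n$ the null-case variance can be confirmed by complete enumeration of the $n!$ equally probable permutations. The following sketch does this for $n = 5$ in exact arithmetic; the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, permutations


def p_prime(perm):
    """Relative frequency of concordant pairs, x being in index order."""
    pairs = list(combinations(range(len(perm)), 2))
    return Fraction(sum(perm[i] < perm[j] for i, j in pairs), len(pairs))


def null_variance(n):
    """Exact variance of p' when all n! rankings are equally probable."""
    values = [p_prime(perm) for perm in permutations(range(n))]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


n = 5
exact = null_variance(n)
formula = Fraction(2 * n + 5, 18 * n * (n - 1))
print(exact, formula)  # 1/24 1/24
```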


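III. Some algebraic formulae

8. We shall now consider some algebraic relations to be used in the proof of normality
of $p'$ for large $n$.

Let $f_d(\rho)$ be a polynomial of degree $d$ in $\rho$. Then

$$\sum_{\rho=0}^{\beta} (-1)^{\beta-\rho} \binom{\beta}{\rho} f_d(\rho) = \begin{cases} 0 & \text{if } d < \beta, \\ a_0\, \beta! & \text{if } d = \beta, \end{cases} \tag{14}$$

where $a_0$ is the coefficient of the highest power $\rho^d$ in $f_d(\rho)$.

To prove (14) write $f_d(\rho) = a_0 \rho^{[d]} + a_1 \rho^{[d-1]} + \ldots$,
where $\rho^{[0]} = 1$, $\rho^{[d]} = \rho(\rho-1)\ldots(\rho-d+1)$, $(d \ge 1)$.

Then (14) follows from the fact that

$$\sum_{\rho=0}^{\beta} (-1)^{\beta-\rho} \binom{\beta}{\rho} \rho^{[g]} = \beta^{[g]} \sum_{\rho-g=0}^{\beta-g} (-1)^{\beta-\rho} \binom{\beta-g}{\rho-g}$$

is equal to $\beta^{[g]}(1-1)^{\beta-g} = 0$ if $\beta - g > 0$ and to $\beta!$ if $\beta = g$.

9. For any non-negative integer $\nu$ we may write

$$n^{\nu} = (n-\alpha)^{[\nu]} + d^{(\alpha)}_{\nu,1} (n-\alpha)^{[\nu-1]} + \ldots + d^{(\alpha)}_{\nu,\nu-1} (n-\alpha) + d^{(\alpha)}_{\nu,\nu}. \tag{15}$$

We will study certain properties of the coefficients $d^{(\alpha)}_{\nu,\kappa}$.
From (15) it is seen immediately that

$$d^{(\alpha)}_{\nu,0} = 1. \tag{16}$$

Inserting in (15) $n = \alpha + \beta$ $(\beta = 0, 1, \ldots)$ we have

$$(\alpha+\beta)^{\nu} = d^{(\alpha)}_{\nu,\nu-\beta}\, \beta! + d^{(\alpha)}_{\nu,\nu-\beta+1}\, \beta^{[\beta-1]} + \ldots + d^{(\alpha)}_{\nu,\nu-1}\, \beta + d^{(\alpha)}_{\nu,\nu} \tag{17}$$

or

$$d^{(\alpha)}_{\nu,\nu-\beta} = \frac{1}{\beta!} \left\{ (\alpha+\beta)^{\nu} - \sum_{\lambda=0}^{\beta-1} \beta^{[\lambda]} d^{(\alpha)}_{\nu,\nu-\lambda} \right\}. \tag{18}$$

Hence we find by induction

$$d^{(\alpha)}_{\nu,\nu-\beta} = \frac{1}{\beta!} \sum_{\rho=0}^{\beta} (-1)^{\beta-\rho} \binom{\beta}{\rho} (\alpha+\rho)^{\nu}. \tag{19}$$

If we take this as definition of $d^{(\alpha)}_{\nu,\kappa}$ for $\kappa < 0$, we have in virtue of (14)

$$d^{(\alpha)}_{\nu,\kappa} = 0 \quad \text{for } \kappa < 0. \tag{20}$$

Expanding $(\alpha+\rho)^{\nu}$ we have from (19)

$$d^{(\alpha)}_{\nu,\nu-\beta} = \sum_{\sigma=0}^{\nu} \binom{\nu}{\sigma} \alpha^{\sigma}\, \frac{1}{\beta!} \sum_{\rho=0}^{\beta} (-1)^{\beta-\rho} \binom{\beta}{\rho} \rho^{\nu-\sigma}.$$

Comparing the last sum with (19) and writing

$$d_{\nu,\kappa} = d^{(0)}_{\nu,\kappa},$$

we have

$$d^{(\alpha)}_{\nu,\nu-\beta} = \sum_{\sigma=0}^{\nu} \binom{\nu}{\sigma} \alpha^{\sigma} d_{\nu-\sigma,\nu-\sigma-\beta},$$

or, putting $\nu - \beta = \kappa$ and noting that, by (20), $d_{\nu-\sigma,\kappa-\sigma} = 0$ for $\sigma > \kappa$,

$$d^{(\alpha)}_{\nu,\kappa} = \sum_{\sigma=0}^{\kappa} \binom{\nu}{\sigma} \alpha^{\sigma} d_{\nu-\sigma,\kappa-\sigma}. \tag{21}$$

We have the recurrence relation

$$d^{(\alpha)}_{\nu+1,\kappa} - d^{(\alpha)}_{\nu,\kappa} = (\alpha + \nu + 1 - \kappa)\, d^{(\alpha)}_{\nu,\kappa-1} \tag{22}$$

which can be obtained by multiplying (15) by $n$, then writing down (15) with $\nu + 1$ instead
of $\nu$, and comparing coefficients in both expressions.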
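The coefficients $d^{(\alpha)}_{\nu,\kappa}$ are conveniently tabulated from the recurrence relation, and the table can then be checked against the defining expansion of $n^{\nu}$ in factorial powers of $n - \alpha$ and against the alternating-sum expression for $d^{(\alpha)}_{\nu,\nu-\beta}$. The sketch below does so for $\alpha = 2$, $\nu = 4$; the function names are illustrative, and $d^{(\alpha)}_{\nu,\nu+1}$ is taken as zero when the recurrence is applied at $\kappa = \nu + 1$.

```python
from math import comb, factorial


def falling(x, m):
    """Factorial power x^[m] = x (x-1) ... (x-m+1), with x^[0] = 1."""
    out = 1
    for i in range(m):
        out *= x - i
    return out


def d_table(alpha, nu_max):
    """d[nu][kappa] for 0 <= kappa <= nu <= nu_max, built from the recurrence."""
    d = [[1]]                                   # n^0 = 1
    for nu in range(nu_max):
        row = [1]                               # d_{nu+1,0} = 1
        for kappa in range(1, nu + 2):
            carried = d[nu][kappa] if kappa <= nu else 0
            row.append(carried + (alpha + nu + 1 - kappa) * d[nu][kappa - 1])
        d.append(row)
    return d


alpha, nu = 2, 4
d = d_table(alpha, nu)

# Check the expansion of n^nu in factorial powers of (n - alpha).
expansion_ok = all(
    n ** nu == sum(d[nu][k] * falling(n - alpha, nu - k) for k in range(nu + 1))
    for n in range(10)
)

# Check the alternating-sum expression for beta! * d_{nu, nu-beta}.
closed_form_ok = all(
    factorial(beta) * d[nu][nu - beta]
    == sum((-1) ** (beta - rho) * comb(beta, rho) * (alpha + rho) ** nu
           for rho in range(beta + 1))
    for beta in range(nu + 1)
)
print(expansion_ok, closed_form_ok)
```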

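10. We prove now two properties of the coefficients $d^{(\alpha)}_{\nu,\kappa}$.

(I) $d_{\nu,\kappa}$ is a polynomial in $\nu$ of degree $2\kappa$, the term of highest degree being $\nu^{2\kappa}/2^{\kappa}\kappa!$.

In virtue of (16) this is true for $\kappa = 0$. And if it is true for $\kappa - 1$, the highest term of $d_{\nu+1,\kappa} - d_{\nu,\kappa}$
is, by (22) with $\alpha = 0$, $\nu^{2\kappa-1}/2^{\kappa-1}(\kappa-1)!$, and hence that of $d_{\nu,\kappa}$, by a well-known theorem,

$$\frac{\nu^{2\kappa}}{2\kappa\, 2^{\kappa-1} (\kappa-1)!} = \frac{\nu^{2\kappa}}{2^{\kappa} \kappa!}.$$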

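(II) $d^{(\gamma-t)}_{t-\rho,\kappa}$ is a polynomial in $t$ of degree $2\kappa$ with the highest term $(-1)^{\kappa} t^{2\kappa}/2^{\kappa}\kappa!$.

From (21),

$$d^{(\gamma-t)}_{t-\rho,\kappa} = \sum_{\sigma=0}^{\kappa} \binom{t-\rho}{\sigma} (\gamma-t)^{\sigma} d_{t-\rho-\sigma,\kappa-\sigma}.$$

In $\binom{t-\rho}{\sigma}$ the highest term in $t$ is $t^{\sigma}/\sigma!$.

In $d_{t-\rho-\sigma,\kappa-\sigma}$ the highest term in $t$ is $t^{2(\kappa-\sigma)}/2^{\kappa-\sigma}(\kappa-\sigma)!$ (by (I)).

In $(\gamma-t)^{\sigma}$ the highest term in $t$ is $(-1)^{\sigma} t^{\sigma}$.

Hence, in $d^{(\gamma-t)}_{t-\rho,\kappa}$ the highest term is

$$\sum_{\sigma=0}^{\kappa} \frac{(-1)^{\sigma} t^{2\kappa}}{2^{\kappa-\sigma} \sigma! (\kappa-\sigma)!} = \frac{t^{2\kappa}}{2^{\kappa}\kappa!} \sum_{\sigma=0}^{\kappa} (-1)^{\sigma} \binom{\kappa}{\sigma} 2^{\sigma} = \frac{(-1)^{\kappa} t^{2\kappa}}{2^{\kappa}\kappa!}.$$
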


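11. $d_{\nu,\nu-\beta}$ has also a combinatorial meaning.

Let

$$S_{\beta}(\nu) = \sum \frac{\nu!}{\nu_1!\, \nu_2! \ldots \nu_{\beta}!}, \quad S'_{\beta}(\nu) = \sum{}' \frac{\nu!}{\nu_1!\, \nu_2! \ldots \nu_{\beta}!},$$

where $\sum$ indicates summation over all $\nu_i \ge 0$, $\sum'$ over all $\nu_i \ge 1$, and in both cases $\nu_1 + \ldots + \nu_{\beta} = \nu$.
$S_{\beta}(\nu)$ is the number of ways of allocating $\nu$ objects on $\beta$ places, and $S'_{\beta}(\nu)$ is the number of
ways of allocating $\nu$ objects on $\beta$ places in such a manner that no place remains empty.

We have $S_{\beta}(\nu) = \beta^{\nu}$,

and a little consideration shows that

$$S_{\beta}(\nu) = S'_{\beta}(\nu) + \binom{\beta}{1} S'_{\beta-1}(\nu) + \ldots + \binom{\beta}{\beta-1} S'_{1}(\nu).$$

Comparing this with (17) we see that

$$d_{\nu,\nu-\beta} = \frac{1}{\beta!} S'_{\beta}(\nu) = \frac{1}{\beta!} \sum{}' \frac{\nu!}{\nu_1!\, \nu_2! \ldots \nu_{\beta}!}. \tag{23}$$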
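The combinatorial meaning of $d_{\nu,\nu-\beta}$ can be verified by counting allocations directly. In the sketch below the coefficients for $\alpha = 0$ are obtained from the recurrence $d_{\nu,\kappa} = d_{\nu-1,\kappa} + (\nu-\kappa)\, d_{\nu-1,\kappa-1}$, which is the recurrence relation of para. 9 with $\alpha = 0$, and $S'_{\beta}(\nu)$ is found by enumerating every allocation; the function names are illustrative.

```python
from itertools import product
from math import factorial


def d0(nu, kappa):
    """d_{nu,kappa} for alpha = 0, from d_{nu,k} = d_{nu-1,k} + (nu-k) d_{nu-1,k-1}."""
    if kappa < 0 or kappa > nu:
        return 0
    if kappa == 0:
        return 1
    return d0(nu - 1, kappa) + (nu - kappa) * d0(nu - 1, kappa - 1)


def allocations_no_empty_place(nu, beta):
    """S'_beta(nu): ways of putting nu distinct objects on beta places, none left empty."""
    return sum(
        1 for places in product(range(beta), repeat=nu) if len(set(places)) == beta
    )


nu, beta = 5, 3
print(allocations_no_empty_place(nu, beta), factorial(beta) * d0(nu, nu - beta))
# both counts are 150
```

With $\alpha = 0$ the numbers $d_{\nu,\nu-\beta}$ so obtained coincide with what are usually called the Stirling numbers of the second kind.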





IV. Proof of normality of p' for n → ∞

12. Any set of different pairs of elements belonging to the population will be briefly referred 
to as a system (two pairs being different if they have no more than one element in common). 

If we represent the elements of a system by points in a plane and the pairs of elements by 
lines joining the points, we have a pattern corresponding to the given system. Two systems
will be said to have the same pattern if there exists a one-to-one correspondence between the 
elements of both systems such that if two elements of one system form a pair, the two corre- 
sponding elements of the other system also form a pair. Thus the only thing relevant in a 
pattern is the lines connecting the points, the position of the points having no significance. 

A pattern will be called simple if one can pass from any point of the pattern to any other 
one along lines belonging to the pattern. A composite pattern is a pattern consisting of more 
than one simple pattern. 

If the elements of a system (or the points of the corresponding pattern) are denoted 
by different letters $A, B, C, \ldots$, each pair of the system can be represented by a pair of
letters. All systems of one pair have the same pattern $(AB)$. There are two patterns
of two pairs, one simple and containing three points $(AB, BC)$ and one composite and
containing four points $(AB, CD)$. There are five patterns of three pairs, three simple
$(AB, BC, CA;\ AB, BC, CD;\ AB, AC, AD)$, one consisting of two different simple patterns
$(AB, CD, DE)$ and one consisting of three equal simple patterns $(AB, CD, EF)$.
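The enumeration of patterns just given (one pattern of one pair, two of two pairs, five of three pairs) can be reproduced mechanically by treating a pattern as a set of distinct pairs taken up to renaming of the points. The following is a brute-force sketch, practicable only for very few pairs; the function names are illustrative.

```python
from itertools import combinations, permutations


def canonical(edges):
    """Smallest relabelled list of pairs over all renumberings of the points used."""
    points = sorted({v for e in edges for v in e})
    best = None
    for perm in permutations(range(len(points))):
        relabel = dict(zip(points, perm))
        form = tuple(sorted(tuple(sorted((relabel[a], relabel[b]))) for a, b in edges))
        if best is None or form < best:
            best = form
    return best


def patterns(b):
    """All patterns of b different pairs, up to renaming of the points."""
    all_pairs = list(combinations(range(2 * b), 2))
    return {canonical(edges) for edges in combinations(all_pairs, b)}


def is_simple(pattern):
    """A pattern is simple when every point can be reached from every other along its lines."""
    points = {v for e in pattern for v in e}
    reached = {next(iter(points))}
    grew = True
    while grew:
        grew = False
        for a, b in pattern:
            if (a in reached) != (b in reached):
                reached |= {a, b}
                grew = True
    return reached == points


counts = [len(patterns(b)) for b in (1, 2, 3)]
simple_three = sum(is_simple(p) for p in patterns(3))
print(counts, simple_three)  # [1, 2, 5] 3
```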

13. If a simple pattern consists of $a_j$ points and $b_j$ pairs,

$$a_j \le b_j + 1. \tag{24}$$

For this is true for $b_j = 1$, and by adding one pair to a simple pattern, at most one point
is added if the new pattern is to be simple again.

Denote the different simple patterns by $S_1, S_2, \ldots$, where $S_1$ stands for the one-pair
pattern and $S_2$ for the two-pairs pattern $(AB, BC)$, all $S_j$ with $j \ge 3$ consisting of three or
more pairs. Let $a_j$ be the number of points and $b_j$ the number of pairs in $S_j$. Then

$$a_1 = 2, \quad b_1 = 1, \quad a_2 = 3, \quad b_2 = 2, \quad b_j \ge 3 \ \text{if } j \ge 3. \tag{25}$$

Consider a pattern $P$ composed of $\gamma_1$ simple patterns $S_1$, $\gamma_2$ simple patterns $S_2$, etc., and
containing $a$ points and $b$ pairs. Then, writing symbolically

$$P = \sum \gamma_j S_j,$$

we have $a = \sum \gamma_j a_j$, $b = \sum \gamma_j b_j$.

In virtue of (24), $3b - 2a = \sum \gamma_j (3b_j - 2a_j) \ge \sum \gamma_j (b_j - 2)$,
and from (25)

$$3b - 2a \ge -\gamma_1, \tag{26}$$

the sign of equality holding if, and only if, pattern $P$ contains no other simple patterns than
$S_1$ and $S_2$.

14. $(p')^{\mu}$ is the probability that $\mu$ pairs of elements drawn from the sample, replacing
each pair after drawing, are all concordant. We may write

$$(p')^{\mu} = \sum_i A_i w'_i,$$

where $A_i$ is the probability that $\mu$ pairs are drawn from the sample in such a way that the
system of different pairs among them has the pattern $P_i$, and if $a_i$ is the number of points
in $P_i$, $w'_i$ is the probability that if $a_i$ elements are drawn from the sample without replacement
and paired according to pattern $P_i$, all pairs of $P_i$ are concordant. The summation is extended
over all patterns with no more than $\mu$ pairs.

Since the probabilities $w'_i$ are of the type for which formula (9) is applicable, we have

$$E(p')^{\mu} = \sum_i A_i w_i, \tag{27}$$

where, as usual, $w_i$ is the population probability corresponding to the sample probability $w'_i$.

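15. Consider a term $Aw$ in (27) corresponding to the pattern

$$P = \sum \gamma_j S_j$$

with $\gamma = \sum \gamma_j$ simple patterns, $a = \sum \gamma_j a_j$ points and $b = \sum \gamma_j b_j$ pairs.

Let

$$\bar{P} = \sum_{j \ge 2} \gamma_j S_j$$

be the pattern obtained from $P$ by excluding the single-pair patterns $S_1$. Then

$$\bar{\gamma} = \gamma - \gamma_1, \quad \bar{a} = a - 2\gamma_1 \quad \text{and} \quad \bar{b} = b - \gamma_1 \tag{28}$$

are the numbers of simple patterns, points and pairs in $\bar{P}$.

We have

$$w = p^{\gamma_1} v, \tag{29}$$

where $v$ is independent of $p$ and $\gamma_1$, only depending on the pattern $\bar{P}$.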


16. The probability $A$ will be studied, in the first place, as a function of $n$ and $\gamma_1$, while
its dependence on $\bar{P}$ will be considered later and only in a special case. It must be borne in
mind that, by (28), $\gamma$, $a$ and $b$ also depend on $\gamma_1$.

Let $Q_1, Q_2, \ldots, Q_b$ be the pairs of pattern $P$ numbered in some definite order. Suppose
pair $Q_{\beta}$ appears $\mu_{\beta}$ times $(\beta = 1, \ldots, b)$. Then

$$\mu_1 + \ldots + \mu_b = \mu, \quad (\mu_{\beta} \ge 1;\ \beta = 1, \ldots, b).$$

Let $R_1, R_2, \ldots, R_{\mu}$ be the total set of the pairs drawn, numbered independently of the
order in which they appear. Then $\mu_1$ $R$'s are equal to $Q_1$, $\mu_2$ $R$'s are equal to $Q_2$, etc.

Let $B$ be the probability that among $\mu$ pairs drawn from the sample, replacing each pair
after drawing, $b$ pairs are different and arranged according to pattern $P$, pair $Q_{\beta}$ appearing
$\mu_{\beta}$ times $(\beta = 1, \ldots, b)$ and the $\mu$ pairs being drawn in a definite order, say $R_1, R_2, \ldots, R_{\mu}$.
Suppose $R_1$ is a $Q_1$. Since any pair drawn may be taken as $Q_1$ (only the relative position
of the pairs being relevant), the probability first to draw $R_1$ is 1. The probability that the
second pair drawn is $R_2$ depends on whether $R_2$ has no, one or both elements in common
with $R_1$. In the first case, it is $\binom{n-2}{2} \big/ \binom{n}{2}$, in the second case $2(n-2) \big/ \binom{n}{2}$ (the factor 2
arising from the fact that each of the two elements of $R_1$ can be the element common with
$R_2$), and in the third case, $1 \big/ \binom{n}{2}$.

In general, if the first $\lambda$ pairs drawn are $R_1, \ldots, R_{\lambda}$, and if they form a pattern $P'$ containing
$\alpha$ different elements, the probability that the $(\lambda+1)$th pair drawn is $R_{\lambda+1}$ depends on whether
$R_{\lambda+1}$ has no, one or both elements in common with $P'$. In the first case it is
$\binom{n-\alpha}{2} \big/ \binom{n}{2}$, in the second case, $c'(n-\alpha) \big/ \binom{n}{2}$, and in the third case, $c'' \big/ \binom{n}{2}$, where $c'$ and $c''$ are
independent of $n$. If, in the last case, $R_{\lambda+1}$ is equal to one of the preceding $R$'s, $c'' = 1$.

$B$ is the product of all $\mu$ such probabilities, and it is seen from the above consideration
that it is of the form

$$B = C\, (n-2)^{[a-2]} \binom{n}{2}^{-\mu+1},$$

where $C$ is independent of $n$.

We also see that a pair which has already appeared before makes no contribution to $C$.
Hence, $C$ only depends on the different pairs of pattern $P$, and is independent of the
numbers $\mu_{\beta}$.

The above reflexion further shows that for any simple pattern contained in $P$, the pair
drawn first, having no elements in common with the preceding pairs, contributes to $C$ the
factor $\tfrac{1}{2}$, except for the first pair, $R_1$, which yields the factor 1. Thus, $C$ contains the factor
$2^{-\gamma+1} = 2^{-\gamma_1-\bar{\gamma}+1}$, and $2^{-\gamma_1}$ is obviously the sole contribution to $C$ from the $\gamma_1$ single pairs
(pattern $S_1$) contained in $P$. Hence

$$B = 2^{-\gamma_1} C' (n-2)^{[a-2]} \binom{n}{2}^{-\mu+1},$$

where $C'$ is independent of $n$ and $\gamma_1$ and also independent of the order in which the $\gamma_1$
single pairs are drawn.

$A$, the probability that $\mu$ pairs drawn form pattern $P$, irrespective of the order in which
they appear, depends on $n$ in the same way as $B$. As a function of $\gamma_1$, $A$ contains, besides
$2^{-\gamma_1}$, the factor $1/\gamma_1!$ owing to the fact that the $\gamma_1$ single pairs are interchangeable. Further
it contains the factor $S'_b(\mu)$ which indicates the number of ways of allocating $\mu$ objects on
$b$ places so that no place remains empty. In virtue of (23) we have

$$S'_b(\mu) = b!\, d_{\mu,\mu-b}.$$

Thus, $A$ is of the form

$$A = D^{(\mu)}_{\gamma_1} (n-2)^{[2\gamma_1+\bar{a}-2]} \binom{n}{2}^{-\mu+1}, \tag{30}$$

where

$$D^{(\mu)}_{\gamma_1} = D'\, \frac{(\bar{b}+\gamma_1)!}{2^{\gamma_1}\, \gamma_1!}\, d_{\mu,\mu-\bar{b}-\gamma_1}, \tag{31}$$

and $D'$ is independent of both $n$ and $\gamma_1$ and only depends on the pattern $\bar{P}$ containing no $S_1$.
Inserting (29) and (30) in (27), we have

$$\binom{n}{2}^{\mu-1} E(p')^{\mu} = \sum D^{(\mu)}_{\gamma_1}\, p^{\gamma_1} v\, (n-2)^{[2\gamma_1+\bar{a}-2]}, \tag{32}$$

the summation taking place over all patterns with no more than $\mu$ pairs. (32) also holds
for $\mu = 0$ if by a ‘pattern of 0 pairs’ we understand the case $\gamma_1 = \bar{\gamma} = 0$ and take (31)
as definition of $D^{(0)}$ with suitably chosen $D'$. $\rho^{[-\delta]}$ with $\delta > 0$ is defined by

$$\rho^{[-\delta]} (\rho+\delta)^{[\delta]} = 1.$$

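17. If

$$\mu_{\nu}(p') = E(p'-p)^{\nu},$$

we have

$$\binom{n}{2}^{\nu-1} \mu_{\nu}(p') = \sum_{\delta=0}^{\nu} (-1)^{\delta} \binom{\nu}{\delta} p^{\delta} \binom{n}{2}^{\delta} \binom{n}{2}^{\nu-\delta-1} E(p')^{\nu-\delta}.$$

Applying (32) with

$$\mu = \nu - \delta, \quad \gamma_1 = \kappa - \delta, \tag{33}$$

we have for the coefficient of $p^{\kappa} v$ in $\binom{n}{2}^{\nu-1} \mu_{\nu}(p')$

$$\sum_{\delta=0}^{\nu} (-1)^{\delta} \binom{\nu}{\delta} D^{(\nu-\delta)}_{\kappa-\delta} \binom{n}{2}^{\delta} (n-2)^{[\bar{a}+2\kappa-2\delta-2]} = \sum_{\delta=0}^{\nu} (-1)^{\delta} \binom{\nu}{\delta} D^{(\nu-\delta)}_{\kappa-\delta}\, 2^{-\delta} \sum_{\rho=0}^{\delta} (-1)^{\rho} \binom{\delta}{\rho} n^{2\delta-\rho} (n-2)^{[\bar{a}+2\kappa-2\delta-2]}.$$

Inserting here, in accordance with (15),

$$n^{2\delta-\rho} = \sum_{\sigma=0}^{2\delta-\rho} d^{(\bar{a}+2\kappa-2\delta)}_{2\delta-\rho,\sigma} (n-\bar{a}-2\kappa+2\delta)^{[2\delta-\rho-\sigma]},$$

we have

$$\sum_{\delta=0}^{\nu} (-1)^{\delta} \binom{\nu}{\delta} D^{(\nu-\delta)}_{\kappa-\delta}\, 2^{-\delta} \sum_{\rho=0}^{\delta} (-1)^{\rho} \binom{\delta}{\rho} \sum_{\sigma=0}^{2\delta-\rho} d^{(\bar{a}+2\kappa-2\delta)}_{2\delta-\rho,\sigma} (n-2)^{[\bar{a}+2\kappa-2-\rho-\sigma]}.$$

Putting $\alpha = \bar{a}+2\kappa-2-\rho-\sigma$, we have for the coefficient of $p^{\kappa} v (n-2)^{[\alpha]}$ in
$\binom{n}{2}^{\nu-1} \mu_{\nu}(p')$

$$c^{(\nu,\alpha)} = \sum_{\delta=0}^{\nu} (-1)^{\delta} \binom{\nu}{\delta} a^{(\nu,\alpha)}(\delta), \tag{34}$$

where

$$a^{(\nu,\alpha)}(\delta) = 2^{-\delta} D^{(\nu-\delta)}_{\kappa-\delta} \sum_{\rho=0}^{\delta} (-1)^{\rho} \binom{\delta}{\rho} d^{(\bar{a}+2\kappa-2\delta)}_{2\delta-\rho,\ \bar{a}+2\kappa-\alpha-\rho-2}. \tag{35}$$

Since $d^{(\bar{a}+2\kappa-2\delta)}_{2\delta-\rho,\ \bar{a}+2\kappa-\alpha-\rho-2} = 0$ if $\bar{a}+2\kappa-\alpha-\rho-2 < 0$, the upper limit of $\rho$ in the summation
may be taken as $\bar{a}+2\kappa-\alpha-2$, which is independent of $\delta$. We have then, in virtue of (31),

$$a^{(\nu,\alpha)}(\delta) = \frac{D'}{2^{\kappa}}\, \frac{(\bar{b}+\kappa-\delta)!}{(\kappa-\delta)!}\, d_{\nu-\delta,\nu-\bar{b}-\kappa} \sum_{\rho=0}^{\bar{a}+2\kappa-\alpha-2} (-1)^{\rho} \binom{\delta}{\rho} d^{(\bar{a}+2\kappa-2\delta)}_{2\delta-\rho,\ \bar{a}+2\kappa-\alpha-\rho-2}, \tag{36}$$

where $D'$ is independent of $\delta$.

In virtue of (I) and (II), para. 10, $a^{(\nu,\alpha)}(\delta)$ is a polynomial in $\delta$. The degree of the $(\rho+1)$th
term in the sum in (36) is $\rho + 2(\bar{a}+2\kappa-\alpha-\rho-2)$, which is highest for $\rho = 0$. Hence, the
degree $d$ of $a^{(\nu,\alpha)}(\delta)$ is

$$d = \bar{b} + 2(\nu-\bar{b}-\kappa) + 2(\bar{a}+2\kappa-\alpha-2) = 2(\nu+\bar{a}+\kappa-\alpha-2) - \bar{b}. \tag{37}$$

Now, according to (34) and (14), $c^{(\nu,\alpha)} = 0$ if $d < \nu$, or, in virtue of (37),

$$c^{(\nu,\alpha)} = 0 \quad \text{if} \quad 2\alpha > \nu - 4 + 2\bar{a} - \bar{b} + 2\kappa. \tag{38}$$

Applying (26) for pattern $\bar{P}$, we have, since $\bar{\gamma}_1 = 0$, $2\bar{a} \le 3\bar{b}$, and consequently

$$2\bar{a} - \bar{b} + 2\kappa \le 2(\bar{b} + \kappa),$$

the sign of equality holding if and only if pattern $\bar{P}$ contains no other simple patterns
than $S_1$ and $S_2$.

Remembering that, according to para. 16, $b \le \mu$, we have in virtue of (33)

$$\bar{b} + \kappa = \bar{b} + \gamma_1 + \delta = b + \delta \le \mu + \delta = \nu. \tag{39}$$

Thus in any case $2\bar{a} - \bar{b} + 2\kappa \le 2\nu$

and, in virtue of (38),

$$c^{(\nu,\alpha)} = 0 \quad \text{if} \quad \alpha > \tfrac{3}{2}\nu - 2. \tag{40}$$

If $\bar{P}$ contains at least one simple pattern with more than two pairs, we even have

$$2\bar{a} - \bar{b} + 2\kappa < 2\nu,$$

and consequently

$$c^{(\nu,\alpha)} = 0 \quad \text{if} \quad \alpha \ge \tfrac{3}{2}\nu - 2. \tag{41}$$

From (40) it appears that the degree in $n$ of $\binom{n}{2}^{\nu-1} \mu_{\nu}(p')$ is

$\le 3h - 2 = \tfrac{3}{2}\nu - 2$ if $\nu = 2h$,

$\le 3h - 1 = \tfrac{3}{2}\nu - \tfrac{5}{2}$ if $\nu = 2h + 1$.

Thus, in $\mu_{2h+1}(p')$, if expanded in powers of $n$, the degree of the highest term is

$$\le 3h - 1 - 4h = -h - 1.$$

In virtue of (13), the degree of the highest term of $\mu_2(p')$ is $-1$, provided that $k - p^2 \ne 0$.
Hence, the degree of

$$\alpha_{2h+1}(p') = \frac{\mu_{2h+1}(p')}{\{\mu_2(p')\}^{h+\frac{1}{2}}}$$

is $\le -h - 1 + h + \tfrac{1}{2} = -\tfrac{1}{2}$. It follows that

$$\alpha_{2h+1}(p') \to 0 \quad \text{if} \quad k - p^2 > 0. \tag{42}$$

($k - p^2 < 0$ is impossible since in this case var $(p')$ would become $< 0$ for large $n$.)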


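18. As we have seen, we may write

$$\binom{n}{2}^{2h-1} \mu_{2h}(p') = R_h (n-2)^{[3h-2]} + R'_h (n-2)^{[3h-3]} + \ldots.$$

Then it follows from (41) that $R_h$ only contains terms depending on patterns $S_1$ and $S_2$,
that is, $R_h$ is of the form

$$R_h = \sum_{\kappa,\lambda} c^{(2h,3h-2)}_{\kappa,\lambda}\, p^{\kappa} k^{\lambda}. \tag{43}$$

The only terms in $\binom{n}{2}^{\mu-1} E(p')^{\mu}$ which can contribute to this sum are of the form

$$D^{(\mu)}_{\gamma_1}\, p^{\gamma_1} k^{\lambda} (n-2)^{[2\gamma_1+3\lambda-2]}.$$

The pattern corresponding to such a term is

$$P = \gamma_1 S_1 + \lambda S_2.$$

Remembering the considerations in para. 16, we see that in each $S_2$ the pair drawn first
contributes to $C$ the factor $\tfrac{1}{2}$ (except if it is $R_1$), while the first drawing of the other pair
yields the factor 2. Hence, the $S_2$'s make no contribution to $C$, and we have

$$C = 2^{-\gamma_1+1}.$$

The contribution of the patterns $S_2$ to $A$ is twofold: since in each $S_2$ the two pairs may be
interchanged, this gives the factor $(1/2!)^{\lambda}$; and since the $\lambda$ patterns $S_2$ may be interchanged,
we have the factor $1/\lambda!$. Thus

$$D^{(\mu)}_{\gamma_1} = 2^{-\gamma_1-\lambda+1}\, \frac{(\gamma_1+2\lambda)!}{\gamma_1!\, \lambda!}\, d_{\mu,\mu-\gamma_1-2\lambda}$$

and, in virtue of (31), since $\bar{b} = 2\lambda$,

$$D' = \frac{1}{2^{\lambda-1} \lambda!}. \tag{44}$$

Inserting in (38)

$$\nu = 2h, \quad \alpha = 3h-2, \quad \bar{a} = 3\lambda, \quad \bar{b} = 2\lambda, \tag{45}$$

we see that $c^{(2h,3h-2)}_{\kappa,\lambda} \ne 0$ is possible only if

$$\kappa + 2\lambda \ge 2h.$$

On the other hand, from (39),

$$\kappa + 2\lambda \le 2h.$$

Hence,

$$\kappa = 2h - 2\lambda. \tag{46}$$

Inserting this in (43), we have

$$R_h = \sum_{\lambda=0}^{h} c^{(2h,3h-2)}_{2(h-\lambda),\lambda}\, p^{2(h-\lambda)} k^{\lambda}, \tag{47}$$

where

$$c^{(2h,3h-2)}_{2(h-\lambda),\lambda} = \sum_{\delta=0}^{2h} (-1)^{\delta} \binom{2h}{\delta} a^{(2h,3h-2)}(\delta).$$

According to (37) in connexion with (45), (46), the degree in $\delta$ of $a^{(2h,3h-2)}(\delta)$ is $2h$. The
highest term is contained in the term corresponding to $\rho = 0$ in (36). Inserting in
(36) the values from (44), (45) and (46) and putting in the sum $\rho = 0$, we have

$$\frac{1}{2^{\lambda-1}\lambda!}\, 2^{-2h+2\lambda}\, \frac{(2h-\delta)!}{(2h-2\lambda-\delta)!}\, d_{2h-\delta,0}\, d^{(4h-\lambda-2\delta)}_{2\delta,h-\lambda}.$$

Thus, in virtue of (16) and (II), para. 10, the highest term of $a^{(2h,3h-2)}(\delta)$ is

$$\frac{1}{2^{\lambda-1}\lambda!}\, 2^{-2h+2\lambda}\, \delta^{2\lambda}\, \frac{(-1)^{h-\lambda} (2\delta)^{2(h-\lambda)}}{2^{h-\lambda}(h-\lambda)!} = \frac{(-1)^{h-\lambda}}{2^{h-1} h!} \binom{h}{\lambda} \delta^{2h}.$$

According to (14),

$$c^{(2h,3h-2)}_{2(h-\lambda),\lambda} = \frac{(-1)^{h-\lambda} (2h)!}{2^{h-1} h!} \binom{h}{\lambda}.$$

Inserting this in (47) we have

$$R_h = \frac{(2h)!}{2^{h-1} h!} \sum_{\lambda=0}^{h} (-1)^{h-\lambda} \binom{h}{\lambda} p^{2(h-\lambda)} k^{\lambda} = \frac{(2h)!}{2^{h-1} h!} (k - p^2)^h.$$

The highest term of $\mu_{2h}(p')$ is thus

$$\frac{2^{h} (2h)!}{h!} (k - p^2)^h n^{-h};$$

that of $\mu_2(p')$ is $4(k - p^2) n^{-1}$, and hence

$$\alpha_{2h}(p') = \frac{\mu_{2h}(p')}{\{\mu_2(p')\}^{h}} \to \frac{(2h)!}{2^{h} h!} \quad \text{if} \quad k - p^2 > 0. \tag{48}$$

From (42) and (48) it follows according to the Second Limit Theorem that the distribution
of $p'$ tends to normality as $n \to \infty$, provided that the marginal distributions are continuous
and $k - p^2 > 0$.

The condition $k - p^2 > 0$ is fulfilled if the population is distributed normally. For, comparing
Esscher’s formula (5) with (13), we find, since var $(\tau)$ = 4 var $(p')$,

$$k - p^2 = \frac{1}{36} - \frac{1}{\pi^2} \left( \sin^{-1} \frac{r}{2} \right)^2.$$

The right-hand side is positive if $|r| < 1$.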
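The behaviour of the condition in the normal case can be seen numerically. The sketch below evaluates $k - p^2 = \tfrac{1}{36} - \pi^{-2} (\sin^{-1} \tfrac{1}{2} r)^2$, the expression which results from equating Esscher's variance with the general variance formula, together with the leading term $4(k - p^2)/n$ of var $(p')$; the function names are illustrative and the expressions hold for a normal population only.

```python
import math


def k_minus_p_squared(r: float) -> float:
    """k - p^2 for a normal population with correlation coefficient r."""
    return 1.0 / 36.0 - (math.asin(r / 2.0) / math.pi) ** 2


def leading_variance_of_p_prime(r: float, n: int) -> float:
    """Leading term 4 (k - p^2) / n of var(p') for large n."""
    return 4.0 * k_minus_p_squared(r) / n


for r in (0.0, 0.5, 0.9, 0.99):
    print(r, k_minus_p_squared(r))  # positive, decreasing towards 0 as |r| -> 1
```

At $r = 0$ the value is $\tfrac{1}{36}$, in agreement with the case of equally probable rankings, and it vanishes only at $|r| = 1$.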


V. The variance of p' in the case of a finite population

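19. Consider a sample of $n$ drawn from a finite population of $N$ in which all values of $x$
and all values of $y$ are different. For the sample probabilities $p'$, $k'$, $(p^2)'$, ... we write now

$$p^{(n)},\ k^{(n)},\ (p^2)^{(n)},\ \ldots,$$

and for the corresponding population probabilities

$$p^{(N)},\ k^{(N)},\ (p^2)^{(N)},\ \ldots.$$

Equation (9) remains valid and may be written as follows:

$$E w^{(n)} = w^{(N)}. \tag{49}$$

In particular, $E p^{(n)} = p^{(N)}$.

The essential difference between this case and the case $N = \infty$ considered above is that the
composite probabilities such as $(p^2)^{(N)}$ or $(pk)^{(N)}$ are not equal to $(p^{(N)})^2$ or $p^{(N)} k^{(N)}$. For
instance, $(p^{(N)})^2$ is evidently the same function of $p^{(N)}$, $k^{(N)}$, $(p^2)^{(N)}$ and $N$ as $(p')^2$ is of $p'$, $k'$,
$(p^2)'$ and $n$. Thus we have, replacing $n$ by $N$ in (10),

$$\binom{N}{2} (p^{(N)})^2 = p^{(N)} + 2(N-2) k^{(N)} + \binom{N-2}{2} (p^2)^{(N)}, \tag{50}$$

and hence

$$(p^2)^{(N)} = \binom{N-2}{2}^{-1} \left\{ \binom{N}{2} (p^{(N)})^2 - p^{(N)} - 2(N-2) k^{(N)} \right\}. \tag{51}$$

On the other hand, from (10) and (49)

$$\binom{n}{2} E(p^{(n)})^2 = p^{(N)} + 2(n-2) k^{(N)} + \binom{n-2}{2} (p^2)^{(N)}, \tag{52}$$

which is the equivalent of equation (11).

On subtracting (50) from (52) we find

$$\binom{n}{2} \operatorname{var}(p^{(n)}) = \frac{N-n}{N(N-1)} \left\{ (N+n-1) p^{(N)} + 2[Nn - 2(N+n-1)] k^{(N)} - [2Nn - 3(N+n-1)] (p^2)^{(N)} \right\}. \tag{53}$$

Substituting for $(p^2)^{(N)}$ the expression in (51), we obtain

$$\binom{n}{2} \operatorname{var}(p^{(n)}) = \frac{N-n}{(N-2)(N-3)} \left\{ (N+n-5)\, p^{(N)} (1 - p^{(N)}) + 2(n-2)(N-2) (k^{(N)} - (p^{(N)})^2) \right\}, \tag{54}$$

or

$$\binom{n}{2} \operatorname{var}(p^{(n)}) = \left( 1 - \frac{(n-2)(n-3)}{(N-2)(N-3)} \right) p^{(N)} (1 - p^{(N)}) + 2(n-2) \left( 1 - \frac{n-3}{N-3} \right) (k^{(N)} - (p^{(N)})^2). \tag{55}$$

For $N \to \infty$, (55) becomes the same as (13).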
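The finite-population variance can be checked exactly on a small artificial population by drawing every possible sample of $n$ from $N$. The sketch below uses the form with correction factors $1 - (n-2)(n-3)/(N-2)(N-3)$ and $1 - (n-3)/(N-3)$, as obtained by substituting for $(p^2)^{(N)}$; the population of seven members and the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb


def conc(y, i, j):
    """True if members i and j are concordant (x in index order, y given)."""
    return (i - j) * (y[i] - y[j]) > 0


def p_and_k(y):
    """Probabilities p and k for the members listed in y."""
    m = range(len(y))
    pairs = list(combinations(m, 2))
    p = Fraction(sum(conc(y, i, j) for i, j in pairs), len(pairs))
    triples = list(permutations(m, 3))        # ordered (A, B, C)
    k = Fraction(
        sum(conc(y, a, b) and conc(y, a, c) for a, b, c in triples), len(triples)
    )
    return p, k


population = (2, 5, 1, 6, 3, 7, 4)            # y-ranks, the x-ranks being 1, ..., N
N, n = len(population), 4

# Every sample of n members, kept in the order of x.
samples = [tuple(population[i] for i in idx) for idx in combinations(range(N), n)]
values = [p_and_k(s)[0] for s in samples]
mean = sum(values) / len(values)
variance = sum((v - mean) ** 2 for v in values) / len(values)

pN, kN = p_and_k(population)
rhs = ((1 - Fraction((n - 2) * (n - 3), (N - 2) * (N - 3))) * pN * (1 - pN)
       + 2 * (n - 2) * (1 - Fraction(n - 3, N - 3)) * (kN - pN ** 2))
print(mean == pN, comb(n, 2) * variance == rhs)  # True True
```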


VI. A sample estimate of var (p')

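20. In the case of an infinite population, let

$$\binom{n}{2} \operatorname{var}'(p') = p' + 2(n-2) k' - (2n-3) (p^2)'. \tag{56}$$

Then, in virtue of (9) and (12),

$$E \operatorname{var}'(p') = \operatorname{var}(p').$$

On inserting in (56) for $(p^2)'$ the expression obtained from (10), we find

$$\binom{n-2}{2} \operatorname{var}'(p') = p'(1 - p') + 2(n-2)(k' - (p')^2),$$

or

$$\operatorname{var}'(p') = \frac{2}{(n-2)(n-3)}\, p'(1 - p') + \frac{4}{n-3} (k' - (p')^2). \tag{57}$$

By analogy,

$$\operatorname{var}'(q') = \frac{2}{(n-2)(n-3)}\, q'(1 - q') + \frac{4}{n-3} (l' - (q')^2), \tag{58}$$

where $l'$ is the probability that among three members $A, B, C$ drawn from the sample without
replacement, one, say $A$, is discordant with the other two.

In the case of a finite population of the type considered in para. 19, we define in a similar
way a statistic $\operatorname{var}^{(n)}(p^{(n)})$ such that

$$E \operatorname{var}^{(n)}(p^{(n)}) = \operatorname{var}(p^{(n)}).$$

We find

$$\operatorname{var}^{(n)}(p^{(n)}) = \frac{2(N-n)}{N(N-1)(n-2)(n-3)} \left\{ (N+n-5)\, p^{(n)} (1 - p^{(n)}) + 2(N-2)(n-2) (k^{(n)} - (p^{(n)})^2) \right\}. \tag{59}$$

A comparison between (59) and (54) shows that $\operatorname{var}^{(n)}(p^{(n)})$ is obtained from var $(p^{(n)})$
by interchanging $n$ and $N$ and taking the opposite sign.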


21. Let $\kappa_{\nu}$ and $\lambda_{\nu}$ be the numbers of sample members concordant and discordant with
$A_{\nu}$ $(\nu = 1, \ldots, n)$. The probability of drawing first the member $A_{\nu}$, and then,
without replacing it, a member concordant with $A_{\nu}$ is $\kappa_{\nu}/n(n-1)$. The probability of
drawing, without replacement, first $A_{\nu}$ and then two other members concordant with $A_{\nu}$ is
$\kappa_{\nu}(\kappa_{\nu}-1)/n(n-1)(n-2)$. Hence

$$p' = \frac{\sum \kappa_{\nu}}{n(n-1)}, \quad k' = \frac{\sum \kappa_{\nu}(\kappa_{\nu}-1)}{n(n-1)(n-2)}. \tag{60}$$

Similarly,

$$q' = \frac{\sum \lambda_{\nu}}{n(n-1)}, \quad l' = \frac{\sum \lambda_{\nu}(\lambda_{\nu}-1)}{n(n-1)(n-2)}. \tag{61}$$

If only the value of $p'$ or $q'$ is required, the use of (6) may be more expedient than that of
(60) or (61). If, however, the variance, and hence $k'$ or $l'$, is wanted, the calculation by means
of the numbers $\kappa_{\nu}$ and $\lambda_{\nu}$ (whose sums are twice the numbers of inversions $I$, $J$) according
to (60) or (61) is to be preferred.

If $p' > \tfrac{1}{2}$, it is more convenient to calculate $q'$ and $l'$ from (61); if $p' < \tfrac{1}{2}$, the calculation of
$p'$ and $k'$ by (60) is more rapid. In many cases one can see directly from the given data
whether the concordant or the discordant pairs prevail, before actually calculating $p'$ or $q'$.

Since $p' + q' = 1$, we have var $(p')$ = var $(q')$,

and also, in the case of a finite population,

$$\operatorname{var}(p^{(n)}) = \operatorname{var}(q^{(n)}).$$

If we write down the equation for var $(q^{(n)})$ analogous to (55) and subtract it from (55),
we have

$$k^{(N)} - (p^{(N)})^2 = l^{(N)} - (q^{(N)})^2$$

or

$$k^{(N)} - l^{(N)} = (p^{(N)})^2 - (q^{(N)})^2 = p^{(N)} - q^{(N)}.$$

Substituting $n$ for $N$, we have

$$k' - l' = p' - q' = \tau.$$

Comparing this with (57) and (58) we see that

$$\operatorname{var}'(p') = \operatorname{var}'(q').$$





REFERENCES 

Esscher, F. (1924). On a method of determining correlation from the ranks of the variates. Skand. Aktuar. 7, 201-19.

Jordan, Ch. (1927). Statistique mathématique. Paris: Gauthier-Villars.
Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30, 81-93.

Lbbusbo, J. ff. (1920). Rota Oio Korolation. ft» Ff ihAimmb Malawtkrlmfit i JUtti, 
km, 21 AnfosM Soptabor 1925, pp. 131-i5' Kobentovn: J, Gjolbnip. 


ADDENDUM

On p. 184 above, I quoted J. W. Lindeberg as having given the formula for the variance
of the probability of concordance p' without proof. I was not aware then that a proof of
this formula, as well as that of the corresponding expression for a finite population
(equation (54) of my paper), is contained in another paper by Lindeberg, 'Some remarks
on the mean error of the percentage of correlation,' Nordic Statistical Journal, 1, 137-41
(1929).




THE SIGNIFICANCE OF RANK CORRELATIONS WHERE 
PARENTAL CORRELATION EXISTS 

By H. E. DANIELS (Wool Industries Research Association)
and M. G. KENDALL 


1. All the known tests of significance of rank correlation coefficients are based on dis- 
tributions from a population in which each possible ranking occurs equally frequently, 
i.e. the null case where no parental correlation exists. We may then say of any particular
coefficient whether it is significant in the sense that it cannot have arisen with any acceptable 
probability from an uncorrelated population. No tests are known in the case where parental 
correlation exists, and we have not seen the point discussed except in reference to the 
replacement of rank correlations by grade or product-moment correlations. Thus, for 
example, if two rank correlation coefficients are both found to be significant there has 
hitherto been no exact method of deciding whether their difference is significant. In this 
paper we consider the problem of determining confidence intervals for a rank correlation 
when the parent is correlated and develop a test of significance for the difference of two 
correlations. 

2. In testing an ordinary product-moment correlation the problem is enormously 
simplified by the assumption that the population is normal, or the further assumption that 
normal theory holds good even when the parent deviates only moderately from normality. 
Apart from means and variances the population is then completely specified by the single
parent parameter ρ and, as is well known, the sample distribution of the estimator depends
only on ρ and the sample number n.

In ranking theory this position no longer obtains. No assumption can in general be made 
about the form of the parent distribution and, in particular, the parent correlation does not 
completely specify the problem. The usual type of variate theory cannot, therefore, be 
expected to meet the requirements. 


3. A satisfactory approach to the problem can, however, be made if the rank correlation
is measured by the coefficient known as τ (Kendall, 1943, chap. 16). We shall then show that,
for large samples at any rate, the problem admits of a solution.

Let the population consist of N members. They may be imagined as laid out in the natural
order 1, 2, ..., N according to the first variate. The rankings according to the second variate
are then some permutation of the numbers 1 to N, and this second array of ranks is all we
need write down in particular cases. It determines the rank correlation τ. Now suppose
we choose a sample of n in one of the $\binom{N}{n}$ possible ways. This sample will, so far as the first
variate is concerned, be in the natural order, and the ranks according to the second variate
permit of the calculation of a sample correlation t. For all possible samples and any given
arrangement of the parent members there will be a distribution of $\binom{N}{n}$ values of t.




4. The sample value of t is an unbiased estimator of τ; that is to say, the mean value of
t in all possible samples is τ. For consider the $\binom{N}{n}$ samples of n. Any particular pair of
members occurs in $\binom{N-2}{n-2}$ samples, that is, all pairs occur equally frequently in the
totality of all samples. In calculating t we assign to any pair +1 if its members are in the
right order and −1 in the contrary case. Thus the total of the score for all samples is $\binom{N-2}{n-2}$
times the score for the population. To obtain t we divide the score for any sample by ½n(n − 1),
and to obtain τ we divide the population score by ½N(N − 1). Hence if S is the score for the
population, the mean value (expectation) of t is

$$E(t) = \frac{S\binom{N-2}{n-2}}{\tfrac12 n(n-1)\binom{N}{n}} = \frac{S}{\tfrac12 N(N-1)} = \tau. \qquad (4\cdot1)$$

5. Unfortunately, it is not true that higher moments of t depend only on τ. A single
example will illustrate the point. Consider the ranking of 9:

        5 2 3 1 6 7 8 9 4.

If the 84 = $\binom{9}{3}$ possible samples of three are written down and t evaluated for each, the
distribution of S (the number of positive pairs) is found to be as follows:

        Values of S     Frequency
             0               2
             1              15
             2              34
             3              33
           Total            84

The mean of this distribution is 182/84 = 13/6, and since

$$t = \frac{2S}{\tfrac12 n(n-1)} - 1,$$

the mean value of t is (26/18) − 1 = 0·44. The value of S for the parent ranking is 26 and hence
τ = (52/36) − 1 = 0·44, verifying equation (4·1). The ranking

        1 2 5 9 3 6 7 8 4

also has τ = 0·44, but the distribution of S in samples of three is now:

        Values of S     Frequency
             0               3
             1              16
             2              29
             3              36
           Total            84



The second moment of this distribution is 5·429, against 5·333 for the first distribution, the
variances being 0·734 against 0·639.
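The enumeration in § 5 is small enough to check by machine. The following Python sketch is an illustration only; the function names are ours and do not come from the paper. It lists every sample of three from a parent ranking, counts the positive pairs S in each, and reports the frequency distribution with its mean and variance.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def positive_pair_distribution(ranking, n):
    """Frequency of S (number of positive pairs) over all samples of n.

    `ranking` lists the second-variate ranks with the members laid out in
    the natural order of the first variate.
    """
    freq = Counter()
    for sample in combinations(ranking, n):  # first-variate order is preserved
        s = sum(1 for a, b in combinations(sample, 2) if a < b)
        freq[s] += 1
    return freq


def mean_and_variance(freq):
    """Exact mean and variance of S for a frequency table."""
    total = sum(freq.values())
    m1 = Fraction(sum(s * f for s, f in freq.items()), total)
    m2 = Fraction(sum(s * s * f for s, f in freq.items()), total)
    return m1, m2 - m1 * m1


first = positive_pair_distribution([5, 2, 3, 1, 6, 7, 8, 9, 4], 3)
second = positive_pair_distribution([1, 2, 5, 9, 3, 6, 7, 8, 4], 3)

for name, freq in (("first", first), ("second", second)):
    mean, var = mean_and_variance(freq)
    print(name, dict(sorted(freq.items())), float(mean), round(float(var), 3))
```

Both rankings give the same mean, as equation (4·1) requires, while the variances differ.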

6. Thus for any parent with given τ there is in general more than one sampling distribution
of t according to the arrangement of the parent ranks. In short, as mentioned above, the
parameter τ does not completely specify the sampling distribution and in asking the question:
What is the standard error of t? we are seeking for an answer which does not exist.

It will be shown, however, that for any given parent ranking the distribution of t tends to
normality with increasing n. The sampling properties of t can therefore be specified to a
first approximation by its first and second moments only, when the samples are not too small.
Further, it will be proved that for given τ the variance of t cannot exceed a certain function
of τ and n whatever the parent ranking. From a knowledge of t and n only, it is thus possible
to set outer bounds to confidence intervals for τ provided n is large enough for the normal
approximation to hold. The limits obtained in this way are sometimes rather wide, and an
alternative procedure is to estimate the true variance of t directly from the sample itself
according to a formula given below. This avoids the loss of efficiency consequent on using an
upper limit to the variance, but it is not known how large a sample is required for the error
of estimation to be tolerable.


7. The development of the theory is facilitated if we introduce at the present stage a
notation similar to that used by Daniels (1944). The ith and jth ranks corresponding to the
second variate are together assigned a score a_ij which takes the value +1 if the members
are in the correct order, −1 if in the wrong order, and a_ii is defined to be zero. The ranks for
the first variate are similarly assigned scores b_ij, but as the members have been taken in the
correct order for this variate, the scores are simply b_ij = ±1, i ≷ j; b_ii = 0. Next we define
c_ij = a_ij b_ij so that c_ij = ±1, according to whether the ranks for the two variates agree or
differ in order, and c_ii = 0. In this notation

$$\tau = c/N(N-1),$$

where c = Σc_ij, i and j both being summed from 1 to N.

When the sample of n pairs is selected at random from the parent N and its coefficient
t is calculated, the values of c_ij for the members of the sample remain the same as in the
population. This fact makes t much more suitable for the present problem than the Spearman
coefficient ρ whose associated scores do not possess the same property. The sample rank
correlation is then

$$t = c^{(n)}/n(n-1),$$

where c^{(n)} = Σ^{(n)} c_ij and Σ^{(n)} denotes summation only over those values of i and j occurring
in the sample.
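The score notation of § 7 translates directly into a small array computation. The sketch below is illustrative; the names `score_matrix`, `row_sums` and `c_total` are hypothetical. It builds the c_ij for the nine-member ranking of § 5 and recovers the row sums c_i, the total c and τ = c/N(N − 1). Only the product a_ij b_ij matters, so the sign convention chosen for a_ij and b_ij separately is immaterial.

```python
def score_matrix(ranking):
    """c[i][j] = +1 if members i and j stand in the same order for both
    variates, -1 if in opposite order, and 0 on the diagonal."""
    size = len(ranking)
    c = [[0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i != j:
                b_ij = 1 if i < j else -1                    # first variate: natural order
                a_ij = 1 if ranking[i] < ranking[j] else -1  # second variate
                c[i][j] = a_ij * b_ij
    return c


parent = [5, 2, 3, 1, 6, 7, 8, 9, 4]
c_matrix = score_matrix(parent)
row_sums = [sum(row) for row in c_matrix]   # the c_i of the text
c_total = sum(row_sums)                     # c = sum of c_ij over all i, j
N = len(parent)
tau = c_total / (N * (N - 1))
print(row_sums, c_total, round(tau, 2))
```

Because a sample keeps the same c_ij as the parent, the sample value c^(n) is simply the sum of the sub-matrix picked out by the sampled members.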


8. It has already been proved that E(t) = τ. To find the variance of t we require E(t²),
so consider

$$\underset{n}{S}\,[c^{(n)}]^2 = \underset{n}{S}\,\Sigma^{(n)} c_{ij}c_{kl},$$

S denoting summation over all selections of the sample of n from the finite parent population
of N members. Let us enumerate the number of ways in which c_ij c_kl and similar products
with 'tied' suffixes, such as c_ij c_ik, occur in the sum.

(i) When i, j, k, l are all different the term c_ij c_kl may occur with $\binom{N-4}{n-4}$ selections of the
remaining members of the sample and the contribution of such terms to S is $\binom{N-4}{n-4}\Sigma' c_{ij}c_{kl}$,
Σ' meaning summation over all unequal values of i, j, k, l from 1 to N.

(ii) The term c_ij c_ik similarly occurs in $\binom{N-3}{n-3}$ ways and there are four ways of tying one
suffix, each of which gives the same contribution to S since c_ij is symmetrical. The total
contribution is $4\binom{N-3}{n-3}\Sigma' c_{ij}c_{ik}$.

(iii) Terms like c_ij² similarly contribute $2\binom{N-2}{n-2}\Sigma' c_{ij}^2$ to S, and all other terms are
zero since c_ii = 0. Hence

$$\underset{n}{S}\,[c^{(n)}]^2 = \binom{N-4}{n-4}\Sigma' c_{ij}c_{kl} + 4\binom{N-3}{n-3}\Sigma' c_{ij}c_{ik} + 2\binom{N-2}{n-2}\Sigma' c_{ij}^2.$$

Expressing the Σ''s in terms of the corresponding Σ's and dividing out by $\binom{N}{n}$ we obtain

$$E[c^{(n)}]^2 = \frac{n^{(4)}}{N^{(4)}}\{c^2 - 4\Sigma c_i^2 + 2N(N-1)\} + \frac{4n^{(3)}}{N^{(3)}}\{\Sigma c_i^2 - N(N-1)\} + \frac{2n^{(2)}}{N^{(2)}}N(N-1),$$

where n^{(r)} = n(n − 1) ... (n − r + 1). Since Σc_ij² = N(N − 1) and Σc_ij c_kl = c², the variance of
t for given τ and n is seen to depend on the value of Σc_ij c_ik = Σc_i², where $c_i = \sum_{j=1}^{N} c_{ij}$.

Let N become large. The quantities c and Σc_i² are respectively O(N²) and O(N³), so if we
introduce τ_i = c_i/N the value of E(t²) for large N becomes

$$E(t^2) \sim \frac{(n-2)(n-3)}{n(n-1)}\tau^2 + \frac{4(n-2)}{n(n-1)}\frac{\Sigma\tau_i^2}{N} + \frac{2}{n(n-1)},$$

and hence in the limit the variance of t is

$$\mathrm{var}\,t = \frac{4(n-2)}{n(n-1)}\mathrm{var}\,\tau_i + \frac{2}{n(n-1)}(1-\tau^2), \qquad (8\cdot1)$$

where var τ_i = Στ_i²/N − τ².
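The counting argument of § 8 can be checked by brute force on a small parent. The sketch below is ours, with hypothetical function names. It assumes the finite-population second moment takes the form obtained by combining the three contributions (i)-(iii) with the factorial powers n^(r), and compares it with a direct average of [c^(n)]² over every sample, using exact fractions throughout.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def falling(x, r):
    """x(x-1)...(x-r+1), written n^(r) in the text."""
    out = 1
    for k in range(r):
        out *= x - k
    return out


def second_moment_formula(ranking, n):
    """E[c^(n)]^2 from c, sum of c_i^2 and N (finite parent, no replacement).

    Requires a parent of at least four members.
    """
    N = len(ranking)
    c_i = [sum(1 if (ranking[i] < ranking[j]) == (i < j) else -1
               for j in range(N) if j != i) for i in range(N)]
    c, sum_ci2 = sum(c_i), sum(x * x for x in c_i)
    pairs = N * (N - 1)
    return (Fraction(falling(n, 4), falling(N, 4)) * (c * c - 4 * sum_ci2 + 2 * pairs)
            + 4 * Fraction(falling(n, 3), falling(N, 3)) * (sum_ci2 - pairs)
            + 2 * Fraction(falling(n, 2), falling(N, 2)) * pairs)


def second_moment_brute(ranking, n):
    """Average of [c^(n)]^2 over every sample of n, by direct enumeration."""
    total = 0
    for sample in combinations(ranking, n):
        score = sum(1 if a < b else -1 for a, b in combinations(sample, 2))
        total += (2 * score) ** 2          # c^(n) counts each pair twice
    return Fraction(total, comb(len(ranking), n))


parent = [5, 2, 3, 1, 6, 7, 8, 9, 4]
print(second_moment_formula(parent, 3), second_moment_brute(parent, 3))
```

Dividing either value by n²(n − 1)² gives E(t²) for the finite parent; equation (8·1) is the limit of the same expression as N becomes large.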


9. The variance of t satisfies the inequality

$$\mathrm{var}\,t \le \frac{2}{n}(1-\tau^2), \qquad (9\cdot1)$$

whatever the parent ranking. Moreover, though the limit may not be attained in any
particular parent ranking, reasons are given in the Appendix for expecting that it cannot
be substantially improved upon. The proof is as follows.

Reverting to a finite parent population of N members, we first seek a maximum for Σc_i².
In terms of the original scores, c_ij = a_ij b_ij. Keeping b_ij = ±1, i ≷ j, b_ii = 0, as before, allow
the a_ij to assume any values subject to the conditions

$$\Sigma a_{ij}^2 = N(N-1), \qquad \Sigma a_{ij}b_{ij} = c = N(N-1)\tau.$$

The stationary values of Σc_i² occur when the a_ij's satisfy the equations

$$(c_i + c_j)\,b_{ij} - \lambda a_{ij} - \mu b_{ij} = 0,$$

which give, on multiplying by b_ij and summing over j,

$$(N - 2 - \lambda)\,c_i + c - \mu(N-1) = 0.$$

Thus, unless the c_i's are all to be equal, in which case Σc_i² is a minimum, λ and μ must take
the values λ = N − 2, μ = c/(N − 1), and since

$$2\Sigma c_i^2 - \lambda N(N-1) - \mu c = 0,$$

it follows that Σc_i² cannot exceed ½N(N − 1)(N − 2) + ½c²/(N − 1). Allowing N to become
large, this implies Στ_i²/N ≤ ½(1 + τ²). Hence

$$\mathrm{var}\,\tau_i \le \tfrac12(1-\tau^2),$$

and so from equation (8·1)

$$\mathrm{var}\,t \le \frac{2}{n}(1-\tau^2). \qquad (9\cdot1)$$


10. Assuming that the sample is large enough for the distribution of t to be normal, the
roots τ₁, τ₂ of the equation

$$(t-\tau)^2 = \frac{2x^2}{n}(1-\tau^2), \qquad (10\cdot1)$$

i.e.

$$\tau = \frac{t \pm x\sqrt{2/n}\,\sqrt{1 + 2x^2/n - t^2}}{1 + 2x^2/n}, \qquad (10\cdot2)$$

provide confidence limits to τ when t is known, x being the standardized normal deviate
corresponding to a given probability of P%. These confidence limits are of course maxima,
in the sense that we shall be wrong in at most P% of the cases in asserting τ to lie between
the calculated limits.

In our proof of the tendency of t to normality it will be necessary to neglect terms of order
n^{−1/2}, and the sample may have to be rather large for such terms to be small, unless τ itself
is small.

The form of equation (9·1) suggests using

$$w = \sin^{-1} t$$

instead of t. To the same order of approximation we can take w as having a normal
distribution with mean ω = sin⁻¹ τ and standard error not exceeding √(2/n), which is
independent of τ. This form is more convenient for assigning confidence limits to τ, and for testing
the significance of the difference between w₁ and w₂ (whose standard error cannot exceed
√[2(1/n₁ + 1/n₂)]), but we have not been able to discover whether the transformation brings
the distribution nearer to normality.
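Both sets of outer limits in § 10 are easy to compute. The Python sketch below is an illustration under stated assumptions; the function names are ours. It solves the quadratic (10·1) in closed form and, separately, applies the transformation w = sin⁻¹ t with standard error √(2/n). The value x = 1·96 is the usual two-sided 5% normal deviate, and t = 0·490 with n = 30 is taken from the worked example of § 17.

```python
from math import asin, sin, sqrt


def outer_limits(t, n, x=1.96):
    """Roots of (t - tau)^2 = (2 x^2 / n)(1 - tau^2), i.e. equation (10.2)."""
    k = 2.0 * x * x / n
    half_width = sqrt(k * (1.0 + k - t * t))
    return (t - half_width) / (1.0 + k), (t + half_width) / (1.0 + k)


def arcsine_limits(t, n, x=1.96):
    """Limits from w = arcsin(t), taking the standard error of w as sqrt(2/n)."""
    w, se = asin(t), sqrt(2.0 / n)
    lo = max(w - x * se, -asin(1.0))   # keep w inside [-pi/2, pi/2]
    hi = min(w + x * se, asin(1.0))
    return sin(lo), sin(hi)


print([round(v, 2) for v in outer_limits(0.490, 30)])
print([round(v, 2) for v in arcsine_limits(0.490, 30)])
```

Clipping w to the range of the inverse sine is our addition; it only matters when t is close to ±1.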


11. We now prove that the distribution of t tends for large n to normality whatever the
parent ranking, provided that |τ| is not near unity.

Write g_ij = c_ij − c/N², so that Σg_ij = 0, and g_ii = −c/N² = −(N − 1)τ/N. The
rth moment of c^{(n)} about its mean value is, to the order of approximation required,
E[Σ^{(n)} g_ij]^r, so consider

$$\underset{n}{S}\,[\Sigma^{(n)} g_{ij}]^r,$$

the summation S being over all possible sample selections.

The argument used by Daniels (1944) to show that in the null case the distribution of rank
correlation in large samples tends to normality can be applied with little modification to the
present problem. The proof is therefore sketched here without much detail.

Two essential conditions to be satisfied are that Σg_ij = 0, which is true by definition, and
Σg_ij g_ik = O(N³), which is true only if 1 − τ² = O(1), so that the tendency to normality may
be expected to break down for high correlations.

The sum S is evaluated as in § 8 by counting the number of ways in which terms like
Σ' g_ij g_kl ... g_pq, and similar terms with tied suffixes, occur. In this way it is expressed as a linear
combination of Σ' g_ij g_kl ..., Σ' g_ij g_ik ..., etc. Every such Σ' is replaceable by the corresponding Σ
together with terms containing more tied suffixes which are of lower order in N since they
involve fewer summations from 1 to N.


12. First consider the even moments with r = 2m. Terms containing more than 3m
different suffixes must vanish, since in such cases it is impossible to avoid at least one g_ij
with two free suffixes, and Σg_ij = 0. For the same reason the only non-vanishing terms with
3m different suffixes are those containing expressions like

$$\Sigma'\, g_{ij}g_{ik}\,g_{lp}g_{lq}\cdots \quad (m \text{ pairs, each pair sharing one suffix}),$$

and terms with fewer different suffixes are of correspondingly lower order in N.

There are $\binom{N-3m}{n-3m}$ ways of selecting the remaining n − 3m members of the sample, and the
suffixes can be tied in $(2m)!\,2^{2m}/\{(2!)^m m!\}$ ways to give the same result.
Dividing out by $\binom{N}{n}$ and noting that $\binom{N-3m}{n-3m}\big/\binom{N}{n} \sim (n/N)^{3m}$ when both N and n are large,
the contribution of such terms to μ_{2m}, the 2mth moment of c^{(n)} about its mean, is found to be

$$\frac{n^{3m}}{N^{3m}}\,\frac{(2m)!\,2^m}{m!}\,[\Sigma g_{ij}g_{ik}]^m,$$

which is of order n^{3m}. Moreover, by the same argument, terms with f < 3m different suffixes
add contributions of order n^f which may be neglected. Hence

$$\mu_{2m} \sim \frac{(2m)!}{2^m m!}\left[\frac{4n^3}{N^3}\Sigma g_{ij}g_{ik}\right]^m,$$

the neglected terms being relatively O(n⁻¹).

13. For the odd moments let r = 2m + 1. Similar considerations show that the
non-vanishing terms of S cannot have more than 3m + 1 different suffixes, and μ_{2m+1} is therefore
of order n^{3m+1}.

Then since c^{(n)}/n^{3/2} has even moments of unit order and odd moments of order n^{−1/2}, the odd
moments may be neglected to that order. We conclude that c^{(n)} is distributed normally for
large n with variance

$$\frac{4n^3}{N^3}\Sigma g_{ij}g_{ik} = 4n^3\,\mathrm{var}\,\tau_i,$$

and t is similarly normal with variance (4/n) var τ_i.

14. The fact that terms of order n^{−1/2} have to be neglected suggests that the normal
approximation only holds good for fairly large samples. This is not surprising since one would expect
skewness to be an important property of the distribution of t when τ is not zero, if only for
the reason that |t| can never exceed unity. It seems worth while to examine the odd moments
in more detail.





The dominant term of the (2m4-l)th moment has 3m + 1 different suffixes, which can 
occur as ^giMui^gmguwT-'^ or 

(2to+ 1)!2«™-«2» (2m + l)!2”‘+2 


Both can be obtained in 


distinct ways 


, and tfei 


ere are 


(2!)“-13!(ot-1)! 
\n~dm — l) 


3!(m-l)! 

ways of selecting the sample with 3m + 1 suffixes 


assigned. The (2 to+ l)th moment of about its mean is therefore 

(2m + 1) ! 2»»+2 

~ ^8?n+i 3 !(m— 1) ! ^^‘^i^'>-i‘gii'^^giigikgji\^gijgii^^~^i 

ignoring terms of relative order 0{n-''-). The corresponding moment of t is obtained to the 
same order on dividing by #”*+ 2 . it depends only on vari and where 






i^gugikgu+^gijgikgfl] = > 


N 


where == The distribution of t is thus specified to 0 («.~r)yjy gj.gj; three moments. 

3 = 1 

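The moment-generating function of the distribution of t in standard measure is

$$M(z) = e^{\frac12 z^2}\left\{1 + \frac{\gamma_1 z^3}{3!} + O(n^{-1})\right\},$$

where γ₁ = μ₃(t)/(var t)^{3/2} = O(n^{−1/2}),
and the frequency distribution of x = (t − τ)/√(var t) is*

$$f(x) = \left\{1 - \frac{\gamma_1}{3!}\frac{d^3}{dx^3}\right\}\frac{e^{-\frac12 x^2}}{\sqrt{(2\pi)}}. \qquad (14\cdot1)$$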

15. The effect of the γ₁ term in modifying the confidence limits based on normal theory
can be seen in the following way. Let ξ be the normal deviate whose chance of being exceeded
is P(ξ). The chance of x exceeding ξ is, from (14·1),

$$F(\xi) = P(\xi) + \frac{\gamma_1}{3!}(\xi^2 - 1)\frac{e^{-\frac12\xi^2}}{\sqrt{(2\pi)}}.$$

If X is the correct limit such that F(X) = P(ξ), it is readily proved by successive
approximation that the formula

$$X = \xi + \frac{\gamma_1}{6}(\xi^2 - 1) \qquad (15\cdot1)$$

gives the appropriate value of X to O(n^{−1/2}). For example, the 5 and 1% limits are
respectively ±1·96 + 0·474γ₁ and ±2·58 + 0·941γ₁.
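The correction (15·1) is a one-line computation. The sketch below is ours, with hypothetical function names. It evaluates the adjusted deviate and, as an assumption about how the limits are applied, forms t + sX for the two tails; with the figures quoted for assessor A in § 17 (t = 0·490, s = 0·081, γ₁ = −0·32) this convention reproduces the adjusted limits given there.

```python
def skew_adjusted_deviate(xi, gamma1):
    """X = xi + (gamma1 / 6)(xi^2 - 1), equation (15.1)."""
    return xi + gamma1 * (xi * xi - 1.0) / 6.0


def adjusted_limits(t, std_error, gamma1, xi=1.96):
    """Limits t + s*X for the lower and upper tails (assumed convention)."""
    lower = t + std_error * skew_adjusted_deviate(-xi, gamma1)
    upper = t + std_error * skew_adjusted_deviate(+xi, gamma1)
    return lower, upper


# Coefficient of gamma1 at the 5% point, to compare with the 0.474 of the text.
print(round(skew_adjusted_deviate(1.96, 1.0) - 1.96, 3))
print([round(v, 2) for v in adjusted_limits(0.490, 0.081, -0.32)])
```

A negative γ₁ shifts both limits downward, so the interval is no longer symmetrical about t.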

16. In practice the value of var t has to be estimated from the sample, and although its
standard error can be shown to be O(n^{−3/2}) by the kind of argument already used, it is not
known how large the sample has to be before the error in estimating the variance can be
safely ignored. It is best to use the unbiased formula

$$\mathrm{var}\,t = \frac{1}{n(n-1)(n-2)(n-3)}\left\{4\Sigma c_i^2 - \frac{2(2n-3)}{n(n-1)}c^2 - 2n(n-1)\right\} \qquad (16\cdot1)$$

(which is easily proved) in calculating var t from the sample, especially if the standard error
of the mean value of t from a number of small samples is required.
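Formula (16·1) needs only n, the total score c and the sum of squares Σc_i² of the row totals. The following sketch is illustrative; the function name is ours. It evaluates the formula for the MA correlation of the worked example, using the totals given at the foot of Table 2 (n = 30, c = 426, Σc_i² = 7470).

```python
from math import sqrt


def unbiased_var_t(n, c, sum_ci_sq):
    """Sample estimate of var t from equation (16.1)."""
    bracket = (4.0 * sum_ci_sq
               - 2.0 * (2 * n - 3) * c * c / (n * (n - 1))
               - 2.0 * n * (n - 1))
    return bracket / (n * (n - 1) * (n - 2) * (n - 3))


var_a = unbiased_var_t(30, 426, 7470)
print(round(var_a, 6), round(sqrt(var_a), 3))
```

The result agrees with the variance and standard error quoted for assessor A in § 17 (ii).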

* Note that the approximation error in f(x) is relatively O(n⁻¹), a stronger result than would be
obtained from a Gram-Charlier approximation based on the first three moments only.




As the term in γ₁ is a small correction it is perhaps sufficient in moderate samples to take

0 - h^gnigi+Sifi - |Sc,y(c^+c^)2 — (16-2) 

and /i 3 {t) = ~G, /iS)j{va.rt)i, (16-3) 

7b 

where the first term in G is the sum of c_ij(c_i + c_j)² over all values of i > j. The unbiased formula
for μ₃ involves some rather tedious computation.

17. To illustrate the methods of the paper we consider an actual example. 

A set of thirty wool samples were visually graded in order of fibre fineness by three assessors. 
The mean fibre diameter for each wool sample was also determined by direct measurement. 
Table 1 shows the measured order (M) compared with that of the three assessors (A, B, C),
in ascending order of experience.


                         Table 1

         M    A    B    C    |     M    A    B    C
         1    5    2    1    |    16   12   14   16
         2    4    5    2    |    17   10   18   15
         3    9    6    6    |    18   30   21   25
         4    3    1    3    |    19   22   26   24
         5    6    7    4    |    20   16   22   19
         6    2    4    5    |    21   20   16   18
         7   15   19   10    |    22   29   20   23
         8   18    3   12    |    23   28   25   22
         9    8    8    7    |    24   19   27   26
        10   11    9    8    |    25   23   28   21
        11   17   13    9    |    26   21   23   27
        12   13   10   11    |    27    7   24   20
        13   24   17   17    |    28   26   29   28
        14   14   12   14    |    29   27   15   30
        15    1   11   13    |    30   25   30   29


The method of working will be seen from the c_ij matrix for the MA correlation shown
in Table 2.

The correlations of the assessors' orders with the measured order are found to be
t_A = 0·490, t_B = 0·724, t_C = 0·816.

(i) Consider first the maximum confidence limits given by (10·2). The 5% limits are
−0·02 < τ_A < 0·80, 0·23 < τ_B < 0·92, 0·34 < τ_C < 0·96.

Again, using the transformation w = sin⁻¹ t, the 5% limits are

0·01 < τ_A < 0·85, 0·30 < τ_B < 0·97, 0·46 < τ_C < 0·99.

The values of w are w_A = 0·512, w_B = 0·810, w_C = 0·954.

The greatest difference is 0·442, and the upper limit to its standard error is √(4/n) = 0·365,
so on these grounds the difference between A and C would not be judged significant.

The 5% limits are very wide, and the lack of significance is disappointing since C was
known to be an expert appraiser while A is relatively inexperienced, and one would have
expected an obvious difference between them.





(ii) The variances estimated from the unbiased formula (16·1) are

var t_A = 0·006630, var t_B = 0·005067, var t_C = 0·002198.

The estimated standard errors are therefore

s_A = 0·081, s_B = 0·071, s_C = 0·047.

The 5% confidence limits, assuming normality, are

0·33 < τ_A < 0·65, 0·58 < τ_B < 0·86, 0·72 < τ_C < 0·91.


                              Table 2

   i                     c_ij  (j = 1, ..., 30)                  c_i
   1    0-+-+ -++++ ++++- +++++ +++++ +++++     21
   2    -0+-+ -++++ ++++- +++++ +++++ +++++     21
   3    ++0-- -++-+ ++++- +++++ +++++ +-+++     17
   4    ---0+ -++++ ++++- +++++ +++++ +++++     19
   5    ++-+0 -++++ ++++- +++++ +++++ +++++     23
   6    ----- 0++++ ++++- +++++ +++++ +++++     17
   7    +++++ +0+-- +-+-- --+++ +++++ +-+++     13
   8    +++++ ++0-- --+-- --++- +++++ +-+++      9
   9    ++-++ +--0+ ++++- +++++ +++++ +-+++     19
  10    +++++ +--+0 ++++- +-+++ +++++ +-+++     19
  11    +++++ ++-++ 0-+-- --++- +++++ +-+++     13
  12    +++++ +--++ -0++- --+++ +++++ +-+++     15
  13    +++++ +++++ ++0-- --+-- -++-- --+++      7
  14    +++++ +--++ -+-0- --+++ +++++ +-+++     13
  15    ----- ----- ----0 +++++ +++++ +++++      1
  16    +++++ +--++ ----+ 0-+++ +++++ +-+++     13
  17    +++++ +--+- ----+ -0+++ +++++ +-+++     11
  18    +++++ +++++ +++++ ++0-- ----- -----      5
  19    +++++ +++++ ++-++ ++-0- -++-+ --+++     15
  20    +++++ ++-++ -+-++ ++--0 +++++ +-+++     17
  21    +++++ +++++ ++-++ ++--+ 0++-+ +-+++     19
  22    +++++ +++++ +++++ ++-++ +0--- -----     11
  23    +++++ +++++ +++++ ++-++ +-0-- -----     11
  24    +++++ +++++ ++-++ ++--+ ---0+ +-+++     15
  25    +++++ +++++ ++-++ ++-++ +--+0 --+++     17
  26    +++++ +++++ ++-++ ++--+ +--+- 0-+++     15
  27    ++-++ +---- ----+ ----- ----- -0+++    -11
  28    +++++ +++++ +++++ ++-++ +--++ ++0+-     21
  29    +++++ +++++ +++++ ++-++ +--++ +++0-     21
  30    +++++ +++++ +++++ ++-++ +--++ ++--0     19

        c = 426,   n = 30,   Σc_i² = 7470


Moreover, we should judge A and C to be significantly different at the 1% level, and A and B
at the 5% level. How far these conclusions are valid depends, of course, on the accuracy of
the variance estimates, but the conclusions seem to agree with what might have been
expected from prior knowledge of the assessors' capabilities.

(iii) The values of γ₁ calculated from (16·2) and (16·3) are

γ₁(A) = −0·32, γ₁(B) = −0·36, γ₁(C) = −0·38.

The distributions would not appear to be very skew, and the distribution of the difference
of two t's is probably nearly normal. The adjusted 5% limits are, from (15·1),
0·32 < τ_A < 0·64, 0·57 < τ_B < 0·85, 0·72 < τ_C < 0·90.




APPENDIX 

1. The question arises whether a particular parental form exists for which the variance
of t assumes the upper limit 2(1 − τ²)/n. We surmise, though we cannot prove, that the
maximum possible variance is attained when the parent ranking has a 'canonical' form obtained
in the following way. Consider again the ranking

        5 2 3 1 6 7 8 9 4.

The number of positive pairs S is 26, so that τ = 0·44. Let us transform this so as to bring
the 1 to the beginning of the ranking but move the 9 so as to preserve the number S at 26.
The 1 passes over three members to go to the beginning and hence adds 3 to the score. The
9 must, therefore, proceed to the left over three numbers so as to subtract 3 from the score
and we reach

        1 5 2 3 9 6 7 8 4.

Now operate similarly with 2 and 9, reaching

        1 2 5 9 3 6 7 8 4.

Had our 9 been contiguous to the 1 and incapable of moving farther to the left we should
have moved the 8 and so on. Proceeding with the process by moving back the 3 and the 9
and 8 we reach

        1 2 3 9 5 6 8 7 4,

and again

        1 2 3 4 9 8 7 6 5.

All the lower numbers 1 to 4 are in the right order and the remainder are in the inverse order.
We call this ranking the 'canonical' order for given S (or τ). It is not always possible to
reduce a given ranking to canonical order, but there cannot be more than one individual
out of place.

2. Consider the effect of a series of transformations leading to the canonical form. The first
process, that of moving 1 and 9, will increase the value of S for some samples involving 1 but
not 9 (leaving the others unchanged), will decrease the value of S for some samples involving
9 but not 1 (leaving the others unchanged), and will, in general, not alter those involving both
1 and 9. Similarly for 2 and 8, and so on. The effect of the transformation is thus to increase
the values of S containing the lower numbers 1, 2, 3, etc., and to decrease those containing
9, 8, 7, etc. These values of S are themselves, in the canonical form, the greatest or least as
the case may be. Consequently the progress to the canonical form is accompanied by
increases in the number of high values of S and increases in the number of lower values, and
one might expect the spread of the distribution to tend to a maximum. In the example
quoted, the distributions of S in samples of 3 for the successive rankings are:

        Values of S          Frequencies f
             0          2     3     3     6    10
             1         15    13    16    10     -
             2         34    35    29    32    40
             3         33    33    36    36    34
          Totals       84    84    84    84    84

The sums ΣfS are all equal to 182. The sums ΣfS² are respectively 448, 450, 456, 462 and 466,
showing the canonical ranking to have the largest variance of the five.
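The sums ΣfS² quoted for the successive rankings can be recomputed directly. The sketch below is ours; the function name is hypothetical. It sums S² over all 84 samples of three for each stage of the reduction to canonical form, S being the number of positive pairs in a sample.

```python
from itertools import combinations


def sum_f_s_squared(ranking, n=3):
    """Sum of S^2 over all samples of n, S being the number of positive pairs."""
    return sum(
        sum(1 for a, b in combinations(sample, 2) if a < b) ** 2
        for sample in combinations(ranking, n)
    )


stages = [
    [5, 2, 3, 1, 6, 7, 8, 9, 4],   # original ranking
    [1, 5, 2, 3, 9, 6, 7, 8, 4],   # after moving 1 and 9
    [1, 2, 5, 9, 3, 6, 7, 8, 4],   # after moving 2 and 9
    [1, 2, 3, 9, 5, 6, 8, 7, 4],   # after moving 3, 9 and 8
    [1, 2, 3, 4, 9, 8, 7, 6, 5],   # canonical form
]
print([sum_f_s_squared(r) for r in stages])
```

Since ΣfS is the same for every stage, a larger ΣfS² means a larger variance of S, and hence of t.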









3. There is, however, another way of carrying out this process. If the parent ranking is
inverted, τ becomes −τ, but the variance of samples of n drawn from the inverted ranking
remains the same, by symmetry. We may then reduce the inverted ranking to its canonical
form and reinvert it so that its coefficient is again τ. This ranking we call the inverse canonical
form. It will be shown that for large N, when τ > 0 the inverse canonical form yields a larger
variance for t than the direct canonical form.

Even in the example already quoted, the inverse canonical ranking (with one member
out of place) is

        3 4 2 5 6 7 8 9 1,

which has a distribution

        Values of S      f
             0           2
             1          27
             2          10
             3          45
           Total        84

The sum ΣfS² is now 472, which is greater than the previous maximum 466.

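4. Consider the canonical case when there are N members altogether, R at the beginning
in the right order, and N − R in the inverse order. If we select n − j members from the R
and j from the N − R the value of S for the sample of n is ½n(n − 1) − ½j(j − 1), and the relative
frequency of S = ½n(n − 1) − ½j(j − 1) is $\binom{R}{n-j}\binom{N-R}{j}\big/\binom{N}{n}$. Now suppose that N tends to
infinity and R/N to the ratio p. The relative frequency of U = ½j(j − 1) tends in the limit to

$$\binom{n}{j}p^{\,n-j}q^{\,j},$$

where q = 1 − p. The mean value of U is then

$$\sum_j \tfrac12 j(j-1)\binom{n}{j}p^{\,n-j}q^{\,j} = \tfrac12 n(n-1)q^2,$$

and since E(t) = 1 − 4E(U)/n(n − 1) = τ, we must have

$$q = \sqrt{\{\tfrac12(1-\tau)\}}. \qquad (4\cdot1\,\mathrm{A})$$

The variance of U is

$$\mathrm{var}\,U = n(n-1)\,pq^2\{nq + \tfrac12(1-3q)\}, \qquad (4\cdot2\,\mathrm{A})$$

and so

$$\mathrm{var}\,t = 16\,pq^2\{nq + \tfrac12(1-3q)\}/n(n-1). \qquad (4\cdot3\,\mathrm{A})$$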
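The closed form for var U can be checked against the binomial distribution of j. The sketch below is ours, with hypothetical function names. It assumes the limiting case described above, in which j is binomial with index n and probability q, and compares a direct summation with the expression n(n − 1)pq²{nq + ½(1 − 3q)}, using exact fractions.

```python
from fractions import Fraction
from math import comb


def var_u_exact(n, q):
    """Variance of U = j(j-1)/2 when j is binomial(n, q), by direct summation."""
    p = 1 - q
    probs = [comb(n, j) * p ** (n - j) * q ** j for j in range(n + 1)]
    mean = sum(Fraction(j * (j - 1), 2) * pr for j, pr in enumerate(probs))
    second = sum(Fraction(j * (j - 1), 2) ** 2 * pr for j, pr in enumerate(probs))
    return second - mean ** 2


def var_u_closed(n, q):
    """n(n-1) p q^2 {n q + (1 - 3q)/2}, the closed form (4.2A)."""
    p = 1 - q
    return n * (n - 1) * p * q ** 2 * (n * q + (1 - 3 * q) / 2)


q_example = Fraction(3, 10)
print(var_u_exact(5, q_example), var_u_closed(5, q_example))
```

Multiplying either value by 16/n²(n − 1)² gives the corresponding variance of t.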

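5. If now the inverted parent ranking is reduced to canonical form, giving ratios p', q'
corresponding to p and q, we shall have

$$q' = \sqrt{\{\tfrac12(1+\tau)\}} \qquad (5\cdot1\,\mathrm{A})$$

and

$$\mathrm{var}\,t' = 16\,p'q'^2\{nq' + \tfrac12(1-3q')\}/n(n-1). \qquad (5\cdot2\,\mathrm{A})$$

Then since q² + q'² = 1,

$$\mathrm{var}\,t' - \mathrm{var}\,t = \frac{16(n-2)}{n(n-1)}(q'-q)(1-q)(1-q'). \qquad (5\cdot3\,\mathrm{A})$$

When τ is positive, q' > q and var t' exceeds var t.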







TESTING FOR NORMALITY 

By R. C. GEARY, Cambridge, University Department of Applied Economics

1. INTRODUCTION

The present communication, one of a series, has two main objectives: 

(1) To show that probabilities derived from the well-known analyses of variance and
other 'small sample' tables, which postulate universal normality, may differ seriously from
the true probabilities when the universes are non-normal, even, in some cases, when the
degree of non-normality is not considerable.

(2) To determine the most efficient tests of normality from a wide field of alternative 
symmetrical tests. 

It may be useful to summarize very briefly previous work in so far as it is strictly relevant
to this study.* The modern theory may be regarded as having been initiated by Karl Pearson
who, in 1895, found the first approximation (i.e. to n⁻¹) to the variances and covariance of
√b₁ and b₂ for samples drawn at random from any universe and, assuming that the √b₁ and
b₂ were distributed jointly with normal probability, constructed 'probability ellipses' from
which the probability of the same values occurring, had the universe, in fact, been normal,
could be inferred very approximately. A considerable advance in moment determination
was made by C. C. Craig (1928). In 1929, R. A. Fisher, in inventing cumulants, simple
functions of the sample moments, and formulating rules for finding their semi-invariants,
developed incidentally a technique for expanding to several terms in 1/n the moments of
√b₁ and b₂ when the universe was normal. This paper was followed soon after by another
(1930), fundamental for all succeeding work on this subject, in which R. A. Fisher ingeniously
applied combinatorial technique to the finding of exact values of the moments of normal
√b₁ and b₂, and gave inter alia the values of the second, fourth and sixth moments of √b₁
and of the first three moments of b₂. The fourth semi-invariant, together with many other
normal semi-invariants of b₂, was determined by J. Wishart in 1930, and a further advance
in R. A. Fisher's technique was made jointly by R. A. Fisher & J. Wishart in 1930. In
1932 Joseph Pepper gave the eighth normal moment of √b₁. Using R. A. Fisher's rules
C. T. Hsu and D. N. Lawley in 1940 gave the exact values for normal random samples of
the fifth and sixth moments of b₂. Using a method due to R. C. Geary (1933) (applying
C. C. Craig's ideas (1928) to the normal problem), R. C. Geary & J. P. G. Worlledge have
recently (1946) found the seventh moment of b₂.

So much for moment determination. In 1930, E. S. Pearson used appropriate Pearson-type
curves, applied to R. A. Fisher's (1929) approximations of the semi-invariants, to find
approximate frequency distributions of $\sqrt{b_1}$ and $b_2$. From the frequency distributions he
computed a table of 1 % and 5 % probability points at intervals for $n$ from 50 to 5000 for
$\sqrt{b_1}$ and for $n$ from 100 to 5000 for $b_2$.

Since at the time the prospect seemed remote of determining the frequency of normal $b_2$
on which reliance could be reposed for samples of moderate sizes, R. C. Geary (1935)†
suggested that the ratio, $a$, of mean deviation to standard deviation computed from the origin

* An excellent account of the development of moment theory up to the year 1930 was given by
J. Wishart (1930).

† The author was informed by M. Fréchet that this test was suggested by Bertrand, but has been
unable to check the reference.




might be used as a test of normality, and gave the 1 and 5 % probability points for this test
at intervals for normal samples of 6-100. E. S. Pearson compared experimentally Geary's
test with $b_2$ and suggested, for samples so large that comparison could safely be made, that
$b_2$ was probably somewhat more sensitive than $a$, a suggestion which will be examined
theoretically in this communication. In 1936 also, R. C. Geary showed that there was a
high (negative) correlation for normal samples between $a(1)$ (see (3·1)) and $b_2$,
and argued therefrom that the former should be nearly as efficient as $b_2$. In 1936,
R. C. Geary gave a table of 1, 5 and 10 % probability points of $a(1)$ at intervals for samples
of 11-1001. In 1938, a brochure by R. C. Geary & E. S. Pearson was published by the
Biometrika Office entitled Tests of Normality, giving tables and diagrams of probability
points of $a(1)$ and $b_2$. There is considerable literature dealing with the effect of universal
non-normality on the normal tests, mostly by way of particular numerical examples: a
selection of papers on this subject is included in the list of references at the end of the paper.

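2. Effect of non-normality
(a) The z-test

The effect of universal non-normality will first be considered in relation to the z-test.

If $x_1, \ldots, x_{n'}$ and $y_1, \ldots, y_{n''}$ are two independent samples drawn at random from the
same universe (normal or non-normal) it is easy to show that, if

$$e^{2z} = \frac{s'^2}{s''^2} = \frac{(n''-1)\sum_{i=1}^{n'}(x_i-\bar x)^2}{(n'-1)\sum_{i=1}^{n''}(y_i-\bar y)^2}, \tag{2·1}$$

then

$$\sigma_z^2 \approx \frac{\beta_2-1}{4}\left(\frac{1}{n'}+\frac{1}{n''}\right) \tag{2·2}$$

when both $n'$ and $n''$ are so large that terms in $n'$ and $n''$ of degree less than $-1$ are regarded
as negligible. This is an obvious generalization of the approximate formula given by
R. A. Fisher* for normal samples, namely,

$$\sigma_z^2 = \frac{1}{2}\left(\frac{1}{n'}+\frac{1}{n''}\right). \tag{2·3}$$
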

It may be useful also to give formulae for the first and second moments from zero for z 
when the two random samples are drawn not necessarily from the same universes, though 
both universes have mean zero and the same variance : 



* Statistical Methods for Research Workers, 8th ed. p. 219.




where the $\lambda$'s indicate semi-invariants of the two universes of the orders indicated. In these
formulae, in effect, terms to order $-2$ in $n'$, $n''$ are retained.

When both samples are large the frequency distribution of $z$ will approach normality
provided that $\beta_2$ is finite. The effect of universal kurtosis can accordingly be assessed in a
very rudimentary manner from (2·2) and (2·3). The z-deviate $\zeta$ corresponding to, say, the
2½ % normal probability point is

$$\zeta = 1.9600\sqrt{\tfrac{1}{2}\left(\frac{1}{n'}+\frac{1}{n''}\right)}.$$

If, however, the universe were not normal and had, in fact, a kurtosis $\beta_2$ with $\beta_2 \neq 3$, the
actual probability of a deviation in excess of $\zeta$ in absolute value would be, not 0.05, but the
normal probability appropriate to a unit variance deviate of $1.9600\sqrt{2/(\beta_2-1)}$. On this
consideration the actual probabilities for different values of $\beta_2$, where the assumed
probability is 0.05, are shown in the fifth column of Table 1.


Table 1. Effect on probability of z of change in universal kurtosis, for large samples

  β₂      2/(β₂−1)    √{2/(β₂−1)}    1.9600 × √{2/(β₂−1)}    Actual probability
  1.5     4           2              3.9200                  0.000089
  2       2           1.4142         2.7718                  0.0056
  2.5     1.3333      1.1547         2.2632                  0.024
  3       1           1              1.9600                  0.050
  3.5     0.8000      0.8944         1.7530                  0.080
  4       0.6667      0.8165         1.6003                  0.110
  4.5     0.5714      0.7559         1.4816                  0.138
  5       0.5000      0.7071         1.3859                  0.166
  5.5     0.4444      0.6667         1.3066                  0.191
  6       0.4000      0.6325         1.2397                  0.215


The table shows that, if the universe from which the samples are drawn has $\beta_2 = 5$, the
true probability is about 1 in 6 instead of the assumed 1 in 20. It is, of course, true that
universes with so large a kurtosis are unusual. This view cannot be held of the range 2.5-4
for $\beta_2$, for which the probability, assumed to be 0.05, can be anything, in fact, from 0.024
to 0.110. Accordingly, if universal kurtosis is markedly negative, use of the standard table
masks significant differences; if kurtosis is positive the standard table exaggerates these
differences. Unless systematic tests have established that kurtosis is negligible the standard
table should not be used for testing significant differences in variance.
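The last column of Table 1 can be recomputed directly. The following is a minimal sketch, assuming only the large-sample rule described above (the normal-theory deviate 1.96 is rescaled by $\sqrt{2/(\beta_2-1)}$ and referred to the unit normal curve); the function name `actual_probability` is illustrative and not taken from the paper.

```python
import math


def actual_probability(beta2, deviate=1.96):
    """Two-tailed normal probability for the deviate rescaled by sqrt(2/(beta2-1)).

    beta2 is the universal kurtosis; beta2 = 3 reproduces the assumed 0.05.
    """
    if beta2 <= 1.0:
        raise ValueError("beta2 must exceed 1")
    scaled = deviate * math.sqrt(2.0 / (beta2 - 1.0))
    # 2 * (1 - Phi(x)) equals erfc(x / sqrt(2)) for the unit normal curve.
    return math.erfc(scaled / math.sqrt(2.0))


if __name__ == "__main__":
    for beta2 in (1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6):
        print(f"beta2 = {beta2:<4}  actual probability = {actual_probability(beta2):.4f}")
```

Running the loop gives values that agree with the tabulated probabilities to the number of figures shown there, which is a useful check on the reading of the table.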

The foregoing analysis gives a theoretical explanation of the striking experimental results 
of E. S. Pearson (1931b) working, however, with a test function 

* = S (iBi-®)®/! S S {Vi-yf] 

{=1 / U-l ) 

and with sample sizes $n' = 5$ and $n'' = 20$, smaller than those contemplated in the present
analysis. With 500 samples Pearson showed that when the frequency at the two tails
together expected from normal theory was 15.4 (= probability 0.0308) the frequencies
actually found in symmetrical universes with $\beta_2$ = 2.5, 4.1 and 7.1 respectively were 7, 39
and 47, equivalent to probabilities of 0.014, 0.078 and 0.094.





If tests of normality indicate universal kurtosis, either of two courses might be adopted:

(i) Assume that $z$ is normally distributed with variance computed from (2·2) with
$(\beta_2 - 3)$ estimated as $k_4/k_2^2$ from the sample, $k_2$ and $k_4$ being R. A. Fisher's (1929) cumulant
functions.

(ii) Enter the standard table, not with $z$ computed from the samples but with
$z\sqrt{2/(\beta_2-1)}$, estimating $\beta_2$ as in (i).

Both of these procedures are, of course, open to the objection that, unless the samples
are extremely large, the estimate of $\beta_2$ is unlikely to be accurate; the real $\beta_2$ might be larger
or smaller than the estimate. Any probabilistic inferences should accordingly be accepted
with reserve.

It is fortunate that the condition specified in the foregoing paragraphs, namely, that the
numbers in the two samples are both large, rarely applies in practical applications. It more
usually happens that the number of classes is small, whereas the number per class is relatively
large. In this case E. S. Pearson (1931b) has shown that the first approximation to $\sigma_z^2$ is
independent of $\beta_2$, from which he inferred that the actual probability when the total number
of samples was large was inconsiderably influenced by kurtosis. In view of the foregoing
analysis it seemed to the writer desirable to carry the inquiry a stage further.

Suppose, then, that $k$ samples are drawn at random from the same universe, $n_j$ in the $j$th
sample, the total $\sum_{j=1}^{k} n_j = n$. It is assumed that $n$ is so large that terms in $n^{-2}$ are negligible,
that the number of samples $k$ is small, and that all the $n_j$ are of the same order of magnitude
as $n$, i.e. that if

$$n_j = \pi_j n, \qquad \sum_{j=1}^{k}\pi_j = 1, \tag{2·6}$$

none of the $\pi_j$ is negligibly small.

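Using R. A. Fisher's cumulant notation with a second subscript to indicate the sample from
which the cumulants were computed, the mean for the $j$th sample is written $k_{1j}$ and its
variance $k_{2j}$. Then

$$(k-1)X = \sum_{j} n_j (k_{1j}-\bar x)^2, \qquad \bar x = \sum_j \pi_j k_{1j},$$

so that

$$\frac{k-1}{n}\,X = \sum_j \pi_j(1-\pi_j)k_{1j}^2 - 2\sum_{j<j'}\pi_j\pi_{j'}k_{1j}k_{1j'},$$

and

$$(n-k)Y = \sum_j (n_j-1)k_{2j}. \tag{2·7}$$
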

Without loss of generality let the universal mean be zero and the variance unity. It may
easily be shown that

$$EX = EY = 1.$$

Set

$$w = e^{2z} = X/Y = \{1+(X-1)\}\{1+(Y-1)\}^{-1}.$$

Then

$$\begin{aligned}
w &= \{1+(X-1)\}\{1-(Y-1)+(Y-1)^2-(Y-1)^3+\ldots\},\\
w^2 &= \{1+(X-1)\}^2\{1-2(Y-1)+3(Y-1)^2-4(Y-1)^3+\ldots\}.
\end{aligned} \tag{2·8}$$





We shall compute the approximate values of $Ew$ and $Ew^2$, i.e. the values to order $n^{-1}$; the
symbol $\doteq$ denotes 'equal to, to approximation required'. From values of the variances
and covariances given by E. S. Pearson (1931b) in his equations (9)-(11), we have


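$$E(X-1)^2 \doteq \frac{2}{k-1} + \frac{(a_{-1}-2k+1)\lambda_4}{n(k-1)^2}, \qquad
E(X-1)(Y-1) \doteq \frac{\lambda_4}{n}, \qquad
E(Y-1)^2 \doteq \frac{\lambda_4+2}{n}, \tag{2·9}$$

with $a_{-1} = \sum_j 1/\pi_j$.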

We require — 

+ 2 s S 1 “ *1^' - 4 S S 2 v,- V^.(l - 37r^) fey, k,y., 

17' } f j" 

$$Y - 1 = \sum_j \phi_j (k_{2j}-1) = \sum_j \phi_j k'_{2j}, \text{ say},$$

remembering that, by definition of cumulants,

$$Ek_{2j} = \lambda_2 = 1.$$

Also

$$(Y-1)^2 = \sum_j\sum_{j'} \phi_j\phi_{j'}k'_{2j}k'_{2j'}.$$

It will be useful for what follows to note that 


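Using R. A. Fisher's formulae (1929) for formation of joint semi-invariants of $k_1$ and $k_2$,
and noting that the $k$ samples are independent, we find from the foregoing

$$\begin{aligned}
n(k-1)\,EX(Y-1)^2 &\doteq (k-1)(\lambda_4+2),\\
n(k-1)^2\,EX^2(Y-1) &\doteq 2(k^2-1)\lambda_4,\\
n(k-1)^2\,EX^2(Y-1)^2 &\doteq (k^2-1)(\lambda_4+2).
\end{aligned} \tag{2·10}$$

Then, from (2·8), (2·9), (2·10),

$$Ew \doteq 1+\frac{2}{n}, \qquad
Ew^2 \doteq \frac{k+1}{k-1}\left(1+\frac{6}{n}\right) - \frac{(k^2+2k-2-a_{-1})\lambda_4}{n(k-1)^2}. \tag{2·11}$$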


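These are the formulae required. It will be noted

(i) that the terms free of $n^{-1}$ are independent of $\lambda_4$, which is equivalent to E. S. Pearson's
result (1931b);

(ii) that the formulae (2·11) agree with the normal values

$$Ew = \left(1-\frac{2}{n-k}\right)^{-1} \doteq 1+\frac{2}{n}, \qquad
Ew^2 = \frac{k+1}{k-1}\left(1-\frac{2}{n-k}\right)^{-1}\left(1-\frac{4}{n-k}\right)^{-1} \doteq \frac{k+1}{k-1}\left(1+\frac{6}{n}\right) \tag{2·12}$$

to $n^{-1}$ when $\lambda_4 = 0$;

(iii) that the approximations at (2·11) are free of $\lambda_3$.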
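The agreement claimed in (ii) is easy to check numerically. The sketch below assumes the approximations take the form $Ew \doteq 1 + 2/n$ and $Ew^2 \doteq \frac{k+1}{k-1}(1+6/n) - (k^2+2k-2-a_{-1})\lambda_4/\{n(k-1)^2\}$, and compares them with the exact normal-theory moments of a variance ratio on $k-1$ and $n-k$ degrees of freedom. The function names and the default of equal class sizes ($\pi_j = 1/k$, so that $a_{-1} = k^2$) are illustrative choices, not the paper's.

```python
def exact_normal_moments(n, k):
    """Exact E(w) and E(w^2) for a normal-theory variance ratio, (k-1, n-k) d.f."""
    nu2 = n - k
    if nu2 <= 4:
        raise ValueError("n - k must exceed 4 for the second moment to exist")
    ew = nu2 / (nu2 - 2)
    ew2 = (k + 1) / (k - 1) * nu2 ** 2 / ((nu2 - 2) * (nu2 - 4))
    return ew, ew2


def approx_moments(n, k, lambda4=0.0, a_minus1=None):
    """Approximations to order 1/n, in the form assumed in the lead-in."""
    if a_minus1 is None:
        a_minus1 = k * k  # equal class sizes: sum of 1/pi_j with pi_j = 1/k
    ew = 1 + 2 / n
    ew2 = ((k + 1) / (k - 1) * (1 + 6 / n)
           - (k * k + 2 * k - 2 - a_minus1) * lambda4 / (n * (k - 1) ** 2))
    return ew, ew2


if __name__ == "__main__":
    print(exact_normal_moments(1000, 4))
    print(approx_moments(1000, 4))
    print(approx_moments(1000, 4, lambda4=1.0))
```

With equal class sizes the kurtosis correction to $Ew^2$ reduces to $-2\lambda_4/\{n(k-1)\}$, which is small whenever $n$ is large compared with $k$.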




The approximations at (2·11) tend to confirm E. S. Pearson's result that, when $n$ is large
compared with $k$, the effect of universal kurtosis is unimportant. It would be useful, however,
to compute the approximate true probability for different values of $k$, $n$, and $a_{-1}$. For this
and for subsequent work the following lemma* will be found useful:


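If $f(x)$ and $\phi(x)$ are two frequency densities with semi-invariants $L_m$ and $L'_m$ $(m = 1, 2, \ldots)$,
respectively, then, formally,

$$f(x) = \exp\left\{\sum_{m=1}^{\infty}\frac{L_m-L'_m}{m!}\left(-\frac{d}{dx}\right)^{m}\right\}\phi(x). \tag{2·13}$$
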

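For the present application take as generating function $\phi$ the frequency distribution of $w$
in the normal case, i.e.

$$\phi(w) = \frac{\Gamma\{\tfrac12(n-1)\}}{\Gamma\{\tfrac12(k-1)\}\,\Gamma\{\tfrac12(n-k)\}}
\left(\frac{k-1}{n-k}\right)^{\frac12(k-1)} w^{\frac12(k-3)}
\left\{1+\frac{(k-1)w}{n-k}\right\}^{-\frac12(n-1)},$$

and, from (2·11),

$$L_2 - L'_2 \doteq -\frac{(k^2+2k-2-a_{-1})\lambda_4}{n(k-1)^2}. \tag{2·14}$$

Assume that

$$L_m - L'_m = 0 \quad (m \neq 2). \tag{2·15}$$
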

Then if the 'normal theory' probability corresponding to the sample value $w$ be $p$, the
approximate 'true' probability, subject to (2·15), will be about $(p+p')$, where $p'$ is given by

$$p' = \tfrac12(L_2-L'_2)\int_w^{\infty}\phi''(w)\,dw = -\tfrac12(L_2-L'_2)\,\phi'(w). \tag{2·16}$$

The term $p'$, of course, merely corrects for the non-normal term in $n^{-1}$ in the variance of $z$;
it takes no account of corrections due to terms of higher (negative) orders in $n$ or even of
non-normal terms in $n^{-1}$ in semi-invariants $L_m$ $(m > 2)$. The calculation is designed merely
to show whether the standard table probability requires correction for universal kurtosis;
this will appear if $p'$ is of the order of magnitude of $p$.


(b) The t-test

In Geary's 1936 paper the expansions to terms in $n^{-2}$ of the first four moments of $t$, where

$$t = \sqrt{n}\,k_1/\sqrt{k_2}, \tag{2·17}$$

were given. Following are the first six semi-invariants $L$ of $t$ to the same approximation as
in the earlier paper:


«,i(2 '*'16?!.^^'^® 2A5-(-6A3A4)|-1-..., 

+ i(8 + 7Ai)n-H (6- 2A4-fAi-^A3A5-t- Jj^AiA^) 

2 h,n~i - (9A3 - 3A5 -i-^Ag A4 + ^Ai) n-i, 

(6 - 2A4 -f 1 2Ai) + (64 - 1 8A4 -t 4Ae -b 75Ai - 63A3 Ag - 6A| + 8 1 Ai A4 -f n-\ 

— (6OA3- 6Ag— 2OA3A4-I- lOSAg)?!”*, 

I/6=(240 - I 2 OA 4 -t- 677^A| -b I 6 A 3 - 21 OA 3 Ag- I 5 OAIA 4 -bl 200At) 

Throughout this subsection we take q = q' n'\m 


(2-18) 


* Due to Charlier and termed the "Differential Series" by the Scandinavian School.
† 1936 formula corrected.




where the $\lambda_i$ are the semi-invariants of the parent universe. For these expressions terms of
order higher than $n^{-2}$ are neglected. They were derived from the moments (from zero) $M'_i$ of $t$,
which were obtained by the method described in the 1936 paper. It will be noted that, to the
approximation used, the expressions involve only the first six semi-invariants of the parent universe.
When the parent universe is normal all the $\lambda_i$ $(i > 2)$ are zero. The magnitude of the numerical
coefficients in the foregoing approximate expressions for the $L_i$ indicates that, when the
universal values of the $\lambda_i$, particularly those of uneven order, are not very small, the frequency
distribution of $t$ may differ appreciably from the classical Gosset-Fisher (1908, 1926)
distribution.

The formal Gram-Charlier expression for the frequency of $t$ could, of course, be written
down at once from (2·18). It is doubtful, however, if the Gaussian can be regarded as the
most appropriate generating function for the frequency of $t$ because, even when the parent
universe is normal, the semi-invariants of the higher even orders are large for moderate
values of $n$. For example, in standard measure the fourth and sixth normal-theory
semi-invariants of $t$ are

$$6/(n-5) \quad\text{and}\quad 240/\{(n-5)(n-7)\}.$$

It is proposed to use (2- 13) for finding the approximate frequency with 

m = T(f, n) = 

the Gosset-Fisher frequency. Let 

/ \-in 

It can easily be shown that the rth derivative (in t) of is 


(2-19) 

( 2 - 20 ) 


T^{\t;n) = {-Y 


(•n-l-r — 1)! 


( to - 1 )!( to - 1 )' 




r(r— 1) 




r(r-l)(r-2)(r-3) 

2.4 




— to . 


r(r — 1) (r — 2) (r — 3) (r — 4) (r - 5) 
2.4.6 


K fi \ -Kn+Zr) 

1 +^J , (2.21) 


with 


TO— 1 {n — lY (to— 1)® 

(TO+l)(TO-f3)’ ’‘*~(TO-fl)(TO-(-3)(TO + 6)’ 


etc. 


Note that (2·21) assumes the Hermite form when $n = \infty$.

The theory will now be applied to particular examples using in all cases $n = 10$. The
universes will be assumed to belong to the Karl Pearson system, so that (M. G. Kendall,
1941) the values of $\lambda_5$ and $\lambda_6$ can be derived (given $\lambda_3$ and $\lambda_4$) from the following equations:


$$\begin{aligned}
(1+4\eta)\lambda_3 + 2\xi &= 0,\\
(1+5\eta)\lambda_4 + 3\xi\lambda_3 + 6\eta &= 0,\\
(1+6\eta)\lambda_5 + 4\xi\lambda_4 + 24\eta\lambda_3 &= 0,\\
(1+7\eta)\lambda_6 + 5\xi\lambda_5 + 10\eta(4\lambda_4+3\lambda_3^2) &= 0.
\end{aligned} \tag{2·22}$$

From the first two equations

$$\eta = (2\lambda_4-3\lambda_3^2)/(-10\lambda_4+12\lambda_3^2-12),$$

which, substituted in the first equation of (2·22), gives $\xi$. The values of $\xi$ and $\eta$, substituted
in the third and fourth equations, give $\lambda_5$ and $\lambda_6$. From (2·18), the $L'_i$ being the semi-invariants




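when the parent universe is normal (i.e. the values found when all the $\lambda$'s are set equal to
zero),

$$\begin{aligned}
L_1 - L'_1 &= J_1 n^{-\frac12} + K_1 n^{-\frac32}, & L_2 - L'_2 &= J_2 n^{-1} + K_2 n^{-2},\\
L_3 - L'_3 &= J_3 n^{-\frac12} + K_3 n^{-\frac32}, & L_4 - L'_4 &= J_4 n^{-1} + K_4 n^{-2},\\
L_5 - L'_5 &= K_5 n^{-\frac32}, & L_6 - L'_6 &= K_6 n^{-2}.
\end{aligned} \tag{2·23}$$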
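The Pearson-system relations (2·22) used above to obtain $\lambda_5$ and $\lambda_6$ from $\lambda_3$ and $\lambda_4$ lend themselves to a short computational sketch. It assumes the four relations in the form $(1+4\eta)\lambda_3+2\xi=0$, $(1+5\eta)\lambda_4+3\xi\lambda_3+6\eta=0$, $(1+6\eta)\lambda_5+4\xi\lambda_4+24\eta\lambda_3=0$ and $(1+7\eta)\lambda_6+5\xi\lambda_5+10\eta(4\lambda_4+3\lambda_3^2)=0$ for a universe in standard measure; the function name is hypothetical, and no guard is included for the exceptional cases in which a denominator vanishes.

```python
def pearson_higher_cumulants(lam3, lam4):
    """Return (lambda5, lambda6) for a Pearson-system universe with unit variance.

    eta and xi are eliminated from the first two relations, then the third
    and fourth relations are solved for lambda5 and lambda6 in turn.
    """
    eta = (2 * lam4 - 3 * lam3 ** 2) / (-10 * lam4 + 12 * lam3 ** 2 - 12)
    xi = -(1 + 4 * eta) * lam3 / 2
    lam5 = -(4 * xi * lam4 + 24 * eta * lam3) / (1 + 6 * eta)
    lam6 = -(5 * xi * lam5 + 10 * eta * (4 * lam4 + 3 * lam3 ** 2)) / (1 + 7 * eta)
    return lam5, lam6


if __name__ == "__main__":
    # Type III (gamma) universe with shape 4: lambda3 = 1, lambda4 = 1.5.
    print(pearson_higher_cumulants(1.0, 1.5))
    # Symmetrical Type VII universe ('Student' form with 10 d.f.): lambda4 = 1.
    print(pearson_higher_cumulants(0.0, 1.0))
```

The two illustrative universes are convenient checks because their standardized semi-invariants are known in closed form: 3 and 7.5 for the gamma case, 0 and 10 for the symmetrical case.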

The $J$ and $K$ are the terms in the $\lambda$ in (2·18). To $n^{-2}$ (i.e. ignoring $n^{-\frac52}$) the frequency generated
from $T$ of (2·19) is as follows:

fit) = + + + 

I ns D® 

+ W-'* Z> 4--^ (-^'a + 3Ji/2 + Ji)4-y^ {K^ + 6J^J^+lQJ^Jg+10JlJf) 

+ ^4^(JA + 2</i JI) + { Y + g (K, + 

I)« 

+ 4 J 3 AT]^ + 3t7| + 6 Jf /a + Ji) + [K^ + ^JiK^ + + 16J^J^ 

Z)8 

+ BOJ^ J2J3+ 15 JIJ4+ 20Jxt^) + Yx 520 lOt/ld- SOJa/g 
nio t4 "I 

+ mM + 80 J! Ji) + ^ (3 Ji/4+ 4Ji Ji) + Z)is j , (2-24) 

.ith 


To $n^{-1}$, (2·24) agrees with the formula given by M. S. Bartlett (1936), in which, however,
there is a small and obvious slip in a sign. The law of formation of the numerical coefficients
of (2·24) is evident; for instance, the numerical coefficient of $D^8J_1^2J_3^2$ is $1/144 = 1/(2!\,3!^2\,2!)$.

The integrals $\int_{-\infty}^{t}$ and $\int_{t}^{\infty}$ of $D^iT$ $(i > 0)$ are found by reducing the exponent of $D$ by unity, as follows:

f ^ Ddt = -T, = f * D^^dt = D®™-!, = - f”* D^”^+^dt = 

J —00 Jl J — 00 Ji J — 00 

(2-25) 

In normal theory the upper and lower 2½ % points of $t$ are ±2.262 for $n = 10$. Table 2
shows the 'true' probabilities, i.e. the value of

$$\int_{-\infty}^{-2.262} f(t)\,dt \tag{2·26}$$

for parent universes specified by $\lambda_3$, $\lambda_4$, using (2·24).
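The normal-theory starting point of this comparison, a lower-tail probability of 0.025 at $t = -2.262$ for samples of 10, can be confirmed with a few lines of numerical integration. This is only a sketch of the Gosset-Fisher side of the calculation: it does not attempt the corrected frequency (2·24), and the helper names and the choice of Simpson's rule are illustrative.

```python
import math


def student_density(t, n):
    """Gosset-Fisher frequency of t for a sample of n, i.e. n - 1 degrees of freedom."""
    nu = n - 1
    log_const = (math.lgamma(n / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(math.pi * nu))
    return math.exp(log_const - (n / 2) * math.log1p(t * t / nu))


def lower_tail(t0, n, steps=2000):
    """Prob(t < -|t0|): Simpson's rule on [-|t0|, |t0|], then symmetry."""
    a = abs(t0)
    h = 2 * a / steps  # steps must be even for Simpson's rule
    total = student_density(-a, n) + student_density(a, n)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * student_density(-a + i * h, n)
    central = total * h / 3
    return (1 - central) / 2


if __name__ == "__main__":
    print(f"Prob(t < -2.262), n = 10: {lower_tail(2.262, 10):.4f}")
```

Integrating over the central interval and using symmetry avoids truncating the long tails of the density, which matters for so small a sample.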

There are two observations to be made on the results presented in this table. The first is
that, despite the considerable number of terms (shown at (2·24)) included in the probability
expansion, the values found in the successive terms cannot be regarded as satisfactorily
convergent for so small a sample as 10, and, of course, the convergence disimproves with
increasing $\sqrt{\beta_1}$. Taken all together, however, they seem consistent and significant. The second
observation is that attention was confined to the negative 'tail' of the distribution. It may
be assumed that, in all cases, the distortion would be very considerably less marked if
regard were had to the probability for $|t| > 2.262$. Actually for universe 3 the probability



is 0.066, not significantly different from the normal theory probability of 0.05. In
justification of the attitude adopted above, the point might be put as follows:

We decide to accept the hypothesis that the universal mean is zero provided that the value
of $t$ found from the particular sample satisfies $t_0 < t < t_1$, where

$$\text{Prob}\{t < t_0\} = \text{Prob}\{t > t_1\} = 0.025.$$

The table is designed to show that if the parent universe is markedly asymmetrical the
range $(t_0, t_1)$ may differ appreciably from $-t_0 = t_1 = 2.262$.

Table 2. Probabilities of t less than −2.262 for samples of 10 for seven universes

  Universe    λ₃ = √β₁    λ₄ = β₂ − 3    Probability
  Normal      0           0              0.025
  2           0           1              0.024
  3           1/2         0              0.041
  4           1/√2        1/2            0.047
  5           1           0              0.072?
  6           1           1              0.086?
  7           1/2         1/2            0.043


As anticipated by earlier work (W. S. Gosset, 1908; R. C. Geary, 1936), the table shows
that the distortion is slight for symmetrical universes; even when $\lambda_4 = 1$ (and $\lambda_3 = 0$) the
probability (0.024) is practically identical with the normal value. There can be little doubt
that the standard table probabilities can be seriously at variance with the true probabilities
when the universes from which the samples are drawn are markedly asymmetrical.

(c) Difference of means

R. A. Fisher's (1925) test of significance

$$t = \frac{k'_1-k''_1}{\{(n'-1)k'_2+(n''-1)k''_2\}^{\frac12}}\left\{\frac{(n'+n''-2)\,n'n''}{n'+n''}\right\}^{\frac12} \tag{2·27}$$

for the difference of averages $k'_1$ and $k''_1$ in normal theory for random samples numbering $n'$
and $n''$ is, of course, a particular case of the analysis of variance considered in § (a) above.
The second cumulants are $k'_2$ and $k''_2$. It is assumed that the unknown universal means and
variances are equal. Suppose now that the random samples in reality have been derived
from universes in which the means are equal but the other semi-invariants $\lambda'_i$ and $\lambda''_i$ are not
necessarily zero for $i > 2$, or even necessarily equal. Since the universal means are assumed
equal, without loss of generality we may take $\lambda'_1 = \lambda''_1 = 0$. This general mathematical model
seems to be the correct one; we are not trying to determine the probability of the samples
being derived from the same universe but rather if they could conceivably have been drawn
from universes with the same arithmetic mean, however much they may differ otherwise.
The correctness or otherwise of the concept may be considered in relation to, say, the
problem of deciding from two random samples which of two types of fertilizer is to be
preferred from yield observations on a given crop on a given kind of land. Undoubtedly the
prime problem will be that of ascertaining which is probably the better yielding (i.e. whether
the arithmetic means are significantly different). Of considerably less importance is the

question of which fertilizer is the more variable; of less importance still is the question of
deciding, say, whether with approximately equal yields one universe is symmetrical and
the other markedly asymmetrical. The point is that the question of the equality of universal
means should be considered without assuming that the other semi-invariants in the universes
from which the samples have been drawn are necessarily equal. This essentially is also the
viewpoint in R. A. Fisher's randomization method.

Expanding the denominator of (2·27) in terms of $(k'_2-\lambda'_2)$ and $(k''_2-\lambda''_2)$ and computing
therefrom the first few terms of the first four moments of $t$, we find the following
approximations to the first four semi-invariants:


AL^^- 




A^L. 


2(7i'A^-t-m"A'')’ 


1-t 


2 n'X'f+n '%^' 
n'XL -f- w 


'-K ) 


•+- 






_____ 
n"^'' {n'X'^WX. 


K 


' n'n"{n'X‘^+n'’Xlf 
3(A;-A'') 




4(%'A'+n"A")2’ 


) As A"\ 


A^T A-\^ /A^ A- 

' ''' ■ 'ft' w'7 


\n'X’,+n'%f 

nK-Af IK , , a ; ai 

(n'X'2 + n"Kf \n' ’^n") 

JK A'A {X'An'n''X[ -f- - n'^X^) -|- Xlin'n"K^ + 

V n'n"{n'XJ^ + n''X!:^f 


iK-K) 

(w'A^-tri"A'^) 


(2-28) 


with 


’ In' + n"\ 

(A>'-l-tA''»i"-l)l 

l\ n'n" j 

{n'+n"-2) j 


Using formula (2- 24) to the term in with the Gosset-Fisher function again as generating 
function, Table 3 shows rough approximations, for four examples, to the 'true' probability
of values of $t < \tau$, where $\tau$ is the (negative) value for probability 0.025 from the normal table,
and $\lambda'_2 = \lambda''_2 = 1$. When the two samples are drawn from different universes the distortion
can accordingly be considerable. The third example suggests that if the universes are the
same the distortion is small, a result to be anticipated from the fact (apparent from (2·28))
that, to the approximation used, the first two semi-invariants are equal to their normal
theory values; this theory confirms the experimental results of E. S. Pearson & N. K.
Adyanthaya (1929).


Table 3 


Example 

n' 


XI 

xs 

Ai 

AI 

Probability 

1 

12 

4 

1 

-1 

1 

-1 

0-045 

2 

18 

6 

1 

-1 

1 

-1 


3 

7 

4 

1/V2 

1/V2 

1/2 

1/2 


4 

10 

6 

1 

0 



1 






It should be remarked that the probabilities in Table 3 (as well as in Table 2) are merely
rough approximations; the samples used are far too small for the results to have any
pretension to accuracy. The object has been merely to show that the actual probability could be
considerably at variance with that shown in the standard table, for small samples.


3. Sufficient conditions for approach to normality of a(c) with increasing n

The remainder of the paper deals with the field of symmetrical tests of normality,
homogeneous of degree zero, represented by (3·1). It is essential to establish the conditions of
approach to normality of the frequency distribution of $a(c)$ as the sample number increases.

Let

$$a(c) = \frac{\dfrac{1}{n}\sum_{i=1}^{n}|x_i-\bar x|^{c}}{\left\{\dfrac{1}{n}\sum_{i=1}^{n}(x_i-\bar x)^2\right\}^{\frac12 c}}, \tag{3·1}$$

where $\bar x = \sum x_i/n$ and $c$ is non-negative. It will be shown in succession that, subject to stated
conditions, with increasing $n$,

(i) the frequency distribution of

$$a_1(c) = \frac{\dfrac{1}{n}\sum_{i=1}^{n}|x_i|^{c}}{\left\{\dfrac{1}{n}\sum_{i=1}^{n}x_i^2\right\}^{\frac12 c}} \tag{3·2}$$

tends towards normality, and

(ii) the frequency distribution of $a(c)$ tends towards that of $a_1(c)$ and hence towards
normality.
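The two statistics just defined are simple to compute, and a small sketch may make the family concrete. It assumes the definitions as ratios of a mean $c$th absolute power to the $\frac12 c$th power of the mean square, taken about the sample mean for $a(c)$ and about the origin for $a_1(c)$; the function names `a_stat` and `a1_stat` and the sample values are illustrative only.

```python
def a_stat(xs, c):
    """a(c): mean |x - mean|^c divided by (mean squared deviation)^(c/2)."""
    n = len(xs)
    mean = sum(xs) / n
    numerator = sum(abs(x - mean) ** c for x in xs) / n
    denominator = (sum((x - mean) ** 2 for x in xs) / n) ** (c / 2)
    return numerator / denominator


def a1_stat(xs, c):
    """a_1(c): the same ratio computed about the origin (universal mean zero)."""
    n = len(xs)
    numerator = sum(abs(x) ** c for x in xs) / n
    denominator = (sum(x * x for x in xs) / n) ** (c / 2)
    return numerator / denominator


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0, 10.0]
    print("a(1) =", a_stat(sample, 1))   # ratio of mean deviation to standard deviation
    print("a(2) =", a_stat(sample, 2))   # identically 1
    print("a(4) =", a_stat(sample, 4))   # the usual b_2
```

Two special cases are worth noting: $c = 1$ gives the ratio of mean deviation to standard deviation, and $c = 4$ gives $b_2$. Because numerator and denominator are homogeneous of the same degree, both statistics are unchanged when every observation is multiplied by a constant.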

It is assumed, without loss of generality, that the universal mean of the universe from
which the sample of $n$ is drawn is zero. Denote the $k$th absolute moment from zero by $\mu_{|k|}$,
$k$ not being necessarily an integer. Given a positive quantity $\epsilon$ arbitrarily small, $\omega(\epsilon)$ can be
found so that

$$\text{Prob}\left\{\left|\frac{1}{n}\sum\left(|x_i|^{c}-\mu_{|c|}\right)\right| < \omega\sqrt{\frac{\mu_{|2c|}-\mu_{|c|}^2}{n}}\right\} > 1-\epsilon, \tag{3·3}$$

$$\text{Prob}\left\{\left|\frac{1}{n}\sum\left(x_i^2-\mu_2\right)\right| < \omega\sqrt{\frac{\mu_4-\mu_2^2}{n}}\right\} > 1-\epsilon, \tag{3·4}$$

provided, of course, that $\mu_{|2c|}$ and $\mu_4$ exist. As $n$ increases $\omega$ may be envisaged as approaching
the normal probability point appropriate to the probability $\epsilon$, since, in the conditions stated,
$\sum|x_i|^{c}/n$ and $\sum x_i^2/n$ are normally distributed in the limit. For samples which satisfy the
inequality in the brackets { } at (3·4) and if $n$ is so large that

the denominator of (3-2) can be expanded to three terms (including the remainder) by 
Taylor’s theorem, so that a-y{c) may be written 

with 2/i == 

/ l-Kc+4) 


(0<6»<1). 




With probability exceeding (1 - e) it is evident, from (3-4), that X is maximized by 

\ /K2V «■/ 

It will suffice, for the present purpose, to infer that

$$|Z| < \kappa,$$

where $\kappa$ is a constant independent of $n$. We have now


Set 




1 lyWiaei C/4 |c+ 2| /c 

wl/^loi ho\F2 \2 /)’ 

(3-6) 

and 


(3-7) 

with 


(3-8) 


For samples which satisfy the inequalities in { } at (3·3) and (3·4) and hence with a
probability exceeding $(1-2\epsilon)$, we have

I ^ I ^g^ V(/^iaoi/^4) , + feUl (3-9) 

' ' 2o- 8cr ny\ \ /t|oiV n j fn 

where $B$ is independent of $n$. Or, briefly,

$$\text{Prob}\left\{|u| < \frac{B}{\sqrt{n}}\right\} > 1-2\epsilon, \tag{3·10}$$

so that $u$ tends in probability towards zero with $1/n$. Now (3·7) may be written in the form
$u = Y' - Y$, where $Y'$ and $Y$ are the respective terms on the left side. If $A$ be any number
and $F$ the total probability function, a well-known lemma (Fréchet, 1937, p. 164) shows that

1 FAA)-FAA) 1 < + J,) -|i)) + 

using (3- 10). Hence the frequency distribution of 

^ \ Mm I 

tends towards that of Y = — (3‘13) 

at every continuity point of the latter frequency, as $n$ tends towards infinity. But $Y$, from
(3·13), is the simple average of $n$ random measures, and its frequency must tend towards
normality provided that its standard deviation exists; from (3·6) it is evident that $\sigma$ is
finite provided that $\mu_{|k|}$, where $k$ is the greater of $2c$ and 4, is finite. Here and in the remainder
of this section it will be useful to remember that if $\mu_{|k|}$ exists so does $\mu_{|k'|}$ for $0 \le k' \le k$.




To prove that the frequency distribution of $a(c)$ tends towards that of $a_1(c)$ and hence
towards normality with increasing $n$ it will be shown that $\sum|x_i-\bar x|^{c}/n$ tends in probability
towards $\sum|x_i|^{c}/n$. Two cases will be considered separately: (1) $c \ge 1$, (2) $1 > c > 0$.

Case (1). $c \ge 1$

For values of for which | ] > | ® f, 

ja:^ i« = ± c» | a:^ - dS jo-i (0<6I<1) 
andwhen |a:i|<|«|- ||.a:t-i]®-|a:J«|g{2‘!+l) |®|=. 

Hence ^ S (| |)“-(| h < 1 i | (| 2 | a;^ |«-i + C| * , (3-14) 

a 1-1 V'M' 1-1 / 

$B$ and $C$ being independent of the $x_i$ and $n$ but depending on $c$. With $\epsilon$ arbitrarily
small, $\omega$ can be found so that

Prob|lSl<oy^|>l^e, 

Prob { I ^ i7(l a:, I <ci > i _e. 

Hence, from (3-14) and (3'15), if ju,^ and /t|ao_ 2 | exist, 

Prob( -2\xi~x\<‘--S\xA<= 

1 ft ' ' ft ' ■sjn j 

for $n$ sufficiently large, the constant $B'$ depending on $c$ but not on $n$. Hence for $c \ge 1$,
$\sum|x_i-\bar x|^{c}/n$ tends in probability towards $\sum|x_i|^{c}/n$. Incidentally, this proves that
$\{\sum(x_i-\bar x)^2/n\}^{\frac12 c}$ tends in probability towards $\{\sum x_i^2/n\}^{\frac12 c}$, the latter two expressions
representing respectively the denominators of $a(c)$ and $a_1(c)$.

Case (2). $1 > c > 0$

Let $\bar x$ satisfy a probabilistic inequality identical in form with the first equation of (3·15)
and let $\gamma$ be any positive quantity, fixed once for all. Let $n$ (presently to be defined further)
be so large that

y>(i) 

Then ^ I S' + r (3-16) 

ft i-l ft\|ailSy \xt\<y/ 

When I I ^ y.(i.e- in S'), 

\x^ — xY~\x^Y — '^cx\x^ — 6x (0 < d < 1), 

so that Prob | |a:^ — » j® — | a;^ )® | <cw^y— w (3‘1'^) 



(3-15) 


When I .a:^ I < 7 (i.e. in S"), given 7j arbitrarily small and positive, n can be found so that 

||a;i-»|®-(a:,j|«l<i/, (3-18) 


when 


a M 

V ft 


since | a: j® (c> 0) is uniformly continuous in E" . We then have 

Prob{lja:^— «|®“ ja:^ |®J <^}> 1 — e. 


(3-19) 





Combining (3-17) and (3-19), it may be inferred that 


Prob 


n' n ' ' 


<m{y~o)J^ y<^ + 9/j>l-2e, (3. 


20 ) 


the first term of the upper limit in { } tending to zero as $n$ tends towards infinity, and $\epsilon$ and $\eta$
being arbitrarily small.

We have accordingly shown that the numerator and denominator of $a(c)$ tend in
probability towards those of $a_1(c)$. Hence $a(c)$ tends in probability towards $a_1(c)$. Hence, using
the lemma cited at (3·11), the total frequency of $a(c)$ tends towards that of $a_1(c)$ which tends
towards normality as $n$ tends towards infinity. Finally:

If $c \ge 0$ the frequency distribution of $a(c)$, given by (3·1), tends towards normality as $n$ tends
towards infinity provided that $\mu_{|k|}$, where $k$ is the greater of $2c$ and 4, is finite.

It seems likely that an analogous theorem can be proved for $0 > c > -\tfrac12$; we shall not,
however, be concerned in this communication with negative values of $c$.


4. Moments of a(c) for normal samples

While it will be shown in later sections that, with indefinitely large samples, $\sqrt{b_1}$ and $b_2$ are
the most efficient tests of asymmetry and kurtosis, respectively, it by no means follows that
other tests are inefficient or that they may not be useful supplements in cases in which the
prime tests are indecisive as to the probable non-normality of a given sample. It is
accordingly proposed to give here close approximations to the first four moments (from the origin)
of $a(c)$ (given by (3·1)) for normal random samples of $n$.

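For normal samples (R. A. Fisher, 1929; R. C. Geary, 1933)

$$M'_k\{a(c)\} = E[a(c)]^{k} = \frac{E\left\{\dfrac{1}{n}\sum|x_i-\bar x|^{c}\right\}^{k}}{E\left\{\dfrac{1}{n}\sum(x_i-\bar x)^2\right\}^{\frac12 ck}}. \tag{4·1}$$

The exact value of the denominator is, of course, known, for

$$Es^{k'} = \left(\frac{2}{n-1}\right)^{\frac12 k'}\frac{\Gamma\{\tfrac12(n-1+k')\}}{\Gamma\{\tfrac12(n-1)\}}, \tag{4·2}$$

since, as usual, $(n-1)s^2 = \sum(x_i-\bar x)^2$. It will be useful to expand $\log_e Es^{k'}$ with $k' = ck$ using
Stirling's formula in (4·2):

$$\begin{aligned}
\log_e Es^{k'} ={}& \frac{k'^2-2k'}{4(n-1)} - \frac{k'(k'-1)(k'-2)}{12(n-1)^2} + \frac{k'^2(k'-2)^2}{24(n-1)^3}
 - \frac{k'(k'-1)(k'-2)(3k'^2-6k'-4)}{120(n-1)^4}\\
&+ \frac{k'^2(k'-2)^2(k'^2-2k'-2)}{60(n-1)^5}
 - \frac{k'(k'-1)(k'-2)(3k'^4-12k'^3+24k'+16)}{252(n-1)^6}\\
&+ \frac{k'^2(k'-2)^2(3k'^4-12k'^3-4k'^2+32k'+32)}{336(n-1)^7} - \ldots,
\end{aligned} \tag{4·3}$$

which checks for $k' = 1$, as far as the terms given there, with Geary (1935, p. 364). Take

$$v(c) = \frac{1}{n}\sum_{i=1}^{n}|z_i|^{c} \tag{4·4}$$

with $z_i = x_i - \bar x$.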
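The Stirling expansion (4·3) of $\log_e Es^{k'}$ can be checked against the exact gamma-function form (4·2). The sketch below assumes unit universal variance, the exact value $Es^{k'} = \{2/(n-1)\}^{\frac12 k'}\,\Gamma\{\frac12(n-1+k')\}/\Gamma\{\frac12(n-1)\}$, and the seven series terms in powers of $(n-1)^{-1}$ with alternating signs; function names are illustrative.

```python
import math


def log_es_exact(kp, n):
    """log E(s^kp) for a normal sample of n with unit universal variance."""
    m = (n - 1) / 2
    return math.lgamma(m + kp / 2) - math.lgamma(m) - (kp / 2) * math.log(m)


def log_es_series(kp, n):
    """The same quantity from the seven-term expansion in powers of 1/(n-1)."""
    v = n - 1
    k = kp
    terms = [
        (k * k - 2 * k) / (4 * v),
        -k * (k - 1) * (k - 2) / (12 * v ** 2),
        k ** 2 * (k - 2) ** 2 / (24 * v ** 3),
        -k * (k - 1) * (k - 2) * (3 * k ** 2 - 6 * k - 4) / (120 * v ** 4),
        k ** 2 * (k - 2) ** 2 * (k ** 2 - 2 * k - 2) / (60 * v ** 5),
        -k * (k - 1) * (k - 2) * (3 * k ** 4 - 12 * k ** 3 + 24 * k + 16) / (252 * v ** 6),
        k ** 2 * (k - 2) ** 2 * (3 * k ** 4 - 12 * k ** 3 - 4 * k ** 2 + 32 * k + 32) / (336 * v ** 7),
    ]
    return sum(terms)


if __name__ == "__main__":
    for kp in (1.0, 2.4, 4.8):
        print(kp, log_es_exact(kp, 25), log_es_series(kp, 25))
```

Every term of the series carries the factor $(k'-2)$, so the expansion vanishes identically at $k' = 2$, in agreement with $Es^2 = 1$.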





The moments of $v(c)$ will be found exactly as in the case of $c = 1$ (Geary, 1936) from the
single or joint normal frequency distributions of $(z_1, z_2, \ldots)$. We find


.n~\ 


X 1 + 


(2c+l)(c + l) ^ (2c + 3) (2c+ 1) (cH- 3) (c + 1) 

r “ — 


21 ( 71 - 1)2 


4!(n-l)i 


+ ....L 




'2<!\} 


{n - 3)h2<5+2) (jj _ 2 )-hfc-H) — 1 ) jj-f 




1+ 


3(c + l)2 
2(w-2)2 


(c+l)3 (c+l)2(c + 3)(7c+9) (c+3)‘‘(c+l)2 
(?j,-2)2'^ 8(n-2)* 2(?i-2)5 

(c + 3)2(c+l)2(61c2 + 310c+265) ) 

240(?i-2)« ■^"•r 

Similarly, for the fourth moment, 

mm - wr = *1 1- 1 h i‘ 


+ 


3?i(w-l) 


71* 


E\z^\^\z^\^+ 


671(71 — 1) (?t - 2) 


E\z^\^\z^\'=\z^ 


with 


c,. 


Tv 

= Gj + Gj + Cg + Gj+Gg 


22fc+l) 


(ft_2)l(4<=+«(?J-l)-2»7i-» 




,, , (3c+l)(c + l) , (3c + 3)(3c+l)( c+3) (c + 1) 
21 ( 77 - 1 )* 41 ( 77 - 1 )* 


+ ... 

(4-6) 


1 /277— 1\^®/C — 1\ 

(— )>• 

, 1 (277- l)°/2c- 1\ . 2'’ ,, r/c-i\ 1* 

For the third moment we write 

Mm) - ■»«'»’ = 5 ^ I ‘> 1 1“ + * I *< I- K I ,, I 

= Ai+A^+As, 

denoting the three terms on the right by A^, A^., Ag respectively. Then 

^1 = ^ (-^) ! (2 77-4<*+*<», 

!(”)l(^“2)«3^+«(«-l)-«“r7-» 


*2 1 Pal 


(4-7) 


( 4 . 8 ) 


+ .. 



3.22« ; „ 4r/2c-l\.1^/, , (2c + l)® {2 c + 3)2(2c + 1)2 ) 

Cs =— jlj 4!(n-iF +‘-j’ 


3 . 22C+1 
TT* 


{n ~ 3)^+1 {n 2)-«*=+« (n - 1) ! (^)] ‘ 

f. , (c+l)(Sc+3) (c + l)®(2c+l) (c + 1) (57c» + 227c® + 265c +81) 

2(«-2)® (»-2)» 24(w-2)4 


(2c + l)(c+l)®(c + 3)(6c + 9) 


6(n-2)*i 


4“ . . . 


G^ = ^{n- 4)lW'>+5) _ 3)-2o-i (« _ 2) (w - 1 ) l]* 


+ 


, ,3(c+l)® 4(c+l)» {c+l)®(7c® + 21c+16) 4(c+ 3) (c+ 1)® (2c + 3) 


+ 


(c+ 3) (c + 1)2 (122c2 + 67 lc2 + 1070c + 526) 
16(w-3)® 


Formulae (4·5), (4·6), (4·7) and (4·8) were checked from the corresponding formulae for
$c = 1$ given in the author's 1936 paper.

From the following section it will be apparent that for indefinitely large samples the most
sensitive test of kurtosis of the field $a(c)$ is found for $c = 4$. At the same time it is shown that
there is really not much difference in efficiency for values of $c$ in the range $5 > c > 2$; moreover,
the results in § 6 (in which the efficiency of the tests for $c = 4$ and $c = 1$ are compared from
the power function viewpoint) suggest that, for samples of moderate size, the superiority,
if any at all, of a test using $a(4) = b_2$ over other tests in the series may be even less marked.
The disadvantage of $a(4)$ is that its frequency is not known for samples of all sizes; and if we
could estimate, with any degree of confidence, the probability points of $a(c)$ for any value or
values of $c > 2$ for medium-size samples we might, for practical purposes, dispense with $a(4)$
altogether, since, while we now know one way of solving the problem of determining the
exact, or almost exact, frequency distribution of $a(4)$, it must be admitted that the method
is extremely tedious. (From the theoretical point of view, however, the $a(4)$ problem must
be solved since it remains a challenge to the mathematical skill of statisticians!) It will
accordingly be of interest to study the order of magnitude of the semi-invariants of $a(c)$
for $c$ near 2.

Consider the case, for example, of c = 2.4, not by any means, it is important to
observe, the lowest value which would be used for tabulating. In Table 4 the first three
moments are given for n = 25. The L's represent, of course, the semi-invariants. The values
of the functions for a_1(c) (given by (3.2)) for n = 24 (i.e. the appropriate number of degrees
of freedom for comparison with a(c)) are also given. These show that the moments of a_1(c)
are very close to those of a(c), which suggests that, when n is not less than, say, 20, the values
of B_1, and corresponding functions of higher orders, if required, for a_1(c) could be used
for the determination of the probability points of a(c). This is important from the
computational point of view because the algebraic expressions for the normal moments of
a_1(c) are exceedingly simple whereas it must be conceded that (4.8) offers a grim prospect
for the computer; furthermore, its principal term is rather slowly convergent unless n > 50
or so, whereas exact values for all values of n can readily be found for the moments of a_1(c)
for normal samples.


Table 4. Normal moments, etc., of a(c) and a_1(c) for c = 2.4

                          a(2.4)        a_1(2.4)
n                         25            24
M'_1 = L_1                1.166252      1.1662524891
M'_2                      1.362004      1.362091186
M'_3                      1.592841      1.593151616
L_2                       0.001860      0.001946318
L_3                       0.000063      0.000069583
√B_1 = L_3/L_2^{3/2}      0.80          0.8104


As with (4.1) for a(c), the moment (from the origin) of any order of a_1(c) is the quotient
of the moments of the same order for numerator and denominator, assuming that the
universal mean is zero and the variance unity. Since the different members of the sample
are independent (the difficulty with a(c) is that the (x_i − x̄) are not independent), for the
moments of the numerator of (3.2) we require only

\[ E|x|^{k} = \frac{2^{k/2}}{\sqrt{\pi}}\left(\frac{k-1}{2}\right)! \tag{4.9} \]

and for the denominator

\[ E\left(\frac{1}{n}\sum x_i^2\right)^{k/2} = \left(\frac{2}{n}\right)^{k/2}\left(\frac{n+k-2}{2}\right)!\Big/\left(\frac{n-2}{2}\right)! \tag{4.10} \]
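The quotient rule just stated can be sketched numerically. The block below is only an illustration: it assumes that a_1(c) of (3.2) is the mean of |x_i|^c divided by the (c/2)th power of the mean of x_i², for n independent standard normal variates, and the function names are hypothetical, not the author's.

```python
import math


def abs_moment(k):
    """E|x|^k for a standard normal variate: 2^(k/2) Gamma((k+1)/2) / sqrt(pi)."""
    return 2.0 ** (k / 2.0) * math.gamma((k + 1.0) / 2.0) / math.sqrt(math.pi)


def m2_power_moment(n, k):
    """E[(sum x_i^2 / n)^(k/2)] for n independent standard normal variates."""
    log_ratio = math.lgamma((n + k) / 2.0) - math.lgamma(n / 2.0)
    return (2.0 / n) ** (k / 2.0) * math.exp(log_ratio)


def a1_moments(c, n):
    """Return (M'_1, M'_2, L_2) of a_1(c) as quotients of numerator and
    denominator moments (assumed form of (3.2), see lead-in)."""
    mu_c = abs_moment(c)
    mu_2c = abs_moment(2.0 * c)
    # Moments from the origin of the numerator (1/n) * sum |x_i|^c.
    num1 = mu_c
    num2 = (mu_2c + (n - 1) * mu_c ** 2) / n
    m1 = num1 / m2_power_moment(n, c)
    m2 = num2 / m2_power_moment(n, 2.0 * c)
    return m1, m2, m2 - m1 ** 2


if __name__ == "__main__":
    print(a1_moments(2.4, 24))
```

Called with c = 2.4 and n = 24, the sketch can be compared with the a_1(2.4) column of Table 4.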


The case of c = 4 is particularly simple. The first four semi-invariants are as follows:

\[ L_1 = M_1' = \frac{3n}{n+2}, \qquad L_2 = \frac{24n^2(n-1)}{(n+2)^2(n+4)(n+6)}, \]

\[ L_3 = \frac{1728\,n^3(n-1)(n-2)}{(n+2)^3(n+4)(n+6)(n+8)(n+10)}, \]

\[ L_4 = \frac{10{,}368\,n^4(n-1)(30n^4+168n^3-608n^2-2672n+3712)}{(n+2)^4(n+4)^2(n+6)^2(n+8)(n+10)(n+12)(n+14)}. \tag{4.11} \]
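As a check on the arithmetic, the semi-invariants (4.11) can be evaluated directly. This is a minimal sketch; the powers of n and the coefficients are those read from (4.11) as set out here, and the function names are illustrative only.

```python
def a1_4_semi_invariants(n):
    """First four semi-invariants of a_1(4) for normal samples of n, after (4.11)."""
    l1 = 3.0 * n / (n + 2)
    l2 = 24.0 * n ** 2 * (n - 1) / ((n + 2) ** 2 * (n + 4) * (n + 6))
    l3 = (1728.0 * n ** 3 * (n - 1) * (n - 2)
          / ((n + 2) ** 3 * (n + 4) * (n + 6) * (n + 8) * (n + 10)))
    poly = 30 * n ** 4 + 168 * n ** 3 - 608 * n ** 2 - 2672 * n + 3712
    l4 = (10368.0 * n ** 4 * (n - 1) * poly
          / ((n + 2) ** 4 * (n + 4) ** 2 * (n + 6) ** 2
             * (n + 8) * (n + 10) * (n + 12) * (n + 14)))
    return l1, l2, l3, l4


def shape_constants(n):
    """Return (sqrt(B_1), B_2 - 3) = (L_3 / L_2^(3/2), L_4 / L_2^2)."""
    _, l2, l3, l4 = a1_4_semi_invariants(n)
    return l3 / l2 ** 1.5, l4 / l2 ** 2


if __name__ == "__main__":
    for n in (24, 50):
        print(n, a1_4_semi_invariants(n), shape_constants(n))
```

The slow decline of √B_1 and B_2 − 3 with n for c = 4 can be seen by running the loop for larger sample sizes.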


Moments, etc., of a_1(c) for normal samples of 24 and 50 are contrasted for c = 2.4 and
c = 4 in Table 5. The contrast between the values of √B_1 and (B_2 − 3) respectively for
a_1(2.4) and a_1(4) is striking in the extreme. Even for n = 24, √B_1[a_1(2.4)] and
B_2[a_1(2.4)] are approaching the values at which a Gram-Charlier approximation to the
frequency distribution may be reasonably convergent. Furthermore, the decline in the
values of the B's from n = 24 to n = 50 is marked for a_1(2.4), while the decline in the
B[a_1(4)] is very slow.

It is accordingly suggested that a table of probability points (perhaps 0.001, 0.01, 0.025,
0.05 and 0.10) of a(c), for c equal to, say, 2.2, be prepared for n ≥ 25 on the assumption that
Gram-Charlier applies throughout. For this purpose the values of the mean and variances
for n at intervals of, say, 10 should be computed from formulae (4.5) and (4.6); the √B_1 and
(B_2 − 3) should, however, be computed as for a_1(c). For lower sample sizes it might be well
to use terms to order n^{-2}, which would render necessary the use of the fifth and sixth
semi-invariants of a_1(c). The formulae given by E. A. Cornish & R. A. Fisher (1937) (assuming
Gram-Charlier) could be used to find the probability points. On account of the minuteness
of the variance L_2 for c near 2 it will be necessary to work to many places of decimals, at
least 10. As stated at the outset, the test of kurtosis a(2.2) will be only slightly less efficient
than a(4) and it may be slightly more efficient than a(1), the probability points of which are
known approximately for samples of all sizes. In any case the a(2.2) table would be a useful
adjunct to that of a(1).


Table 5. Normal moments, etc., of a_1(c) for c = 2.4 and c = 4

                           n = 24                        n = 50
                           c = 2.4        c = 4          c = 2.4        c = 4
M'_1 = L_1                 1.1662524891   2.769231       1.1721603127   2.884615
L_2                        0.001946318    0.559932       0.001058462    0.359550
L_3                        0.000069583    0.752488       0.000022261    0.343337
L_4                        0.000004921    1.956999       0.000000919    0.711375
√B_1 = L_3/L_2^{3/2}       0.8104         1.7960         0.6462         1.5925
B_2 − 3 = L_4/L_2^2        1.30           6.24           0.82           5.50


In an earlier paper (1936) the writer suggested that the correlation between b_2 and a(1)
for normal samples gave some indication of the relative efficiency of these two tests of
normality. In this order of ideas it seems desirable to compute the approximate value of
the correlation coefficient between a(c) and a(c'), where c and c' are any two positive
constants. In the first instance the universe from which the sample of n is drawn is not
necessarily normal. Since in the present application we will be concerned only with large
samples we assume the universal mean known (and accordingly it may be taken as zero,
i.e. λ_1 = 0), so that, instead of a(c) we use, in reality, a_1(c) given by (3.2). In the remainder
of this section we write a for a_1(c) and a' for a_1(c').


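Set

\[ \alpha = \mu_{|c|}/\mu_2^{c/2}, \qquad \alpha' = \mu_{|c'|}/\mu_2^{c'/2}, \tag{4.12} \]

\[ C = \frac{c+c'}{2}, \qquad C_k = \frac{C(C+1)(C+2)\cdots(C+k-1)}{k!}, \tag{4.13} \]

\[ y_i = (|x_i|^{c}-\mu_{|c|})/\mu_{|c|}, \qquad y_i' = (|x_i|^{c'}-\mu_{|c'|})/\mu_{|c'|}, \qquad z_i = (x_i^2-\mu_2)/\mu_2. \tag{4.14} \]

Then

\[ \frac{aa'}{\alpha\alpha'} = \left(1+\frac{1}{n}\sum y_i\right)\left(1+\frac{1}{n}\sum y_i'\right)\left(1+\frac{1}{n}\sum z_i\right)^{-\frac{1}{2}(c+c')}. \tag{4.15} \]
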

The mean value of aa'/αα' was found approximately (i.e. to terms in n^{-3}) by formally
expanding the last factor in (4.15), multiplying by the first two factors, and setting down
the mean value term by term, so that

= Eaa'jaa'^ 

+ -^Gi + — 2 — ^ n ~n^ - 1 Ez^Ez^) 

+ ^ 6 \ ~ (nEyz 4- nEy'z ) + ^ {nEyz^ + nEy'z ^ ) 

n 

- -| [n{Eyz^ + Ey'z^) + 3?i - 1 Ez\Eyz + Ey'z)] 

'fit 


Q 

+ ^ [ 4?1 71 - 1 Ez^{Eyz + Ey'z) + 6% n - 1 Ez\Eyz^ + Ey'z^)] 


30(7, nn — ln — 2 




Eh\Eyz + Ey'z ) } + l^nEyy' - ^ nEyy', 




(j 

+ -| [nEyy’z^ + 2n m - 1 Eyz Ey'z + %n-l Eyy' Ez^] 

ft 

Q 

— I [w Ji — 1 Eyy' Ez^ + 3n n — 1 {Eyz^Ey'z + Ey'z^Eyz + Eyy'zEz^)] 


+ ^ ^ ' 2 ^ — “ Eyy'EH^ +12«.ji-lw — 2 EyzEy'zEz^^ . (4- 16) 

The E's in (4.16) are readily calculable from (4.14), e.g.

\[ Eyy' = E\,y_iy_i' = E(|x|^{c}-\mu_{|c|})(|x|^{c'}-\mu_{|c'|})/\mu_{|c|}\mu_{|c'|} = (\mu_{|c+c'|}/\mu_{|c|}\mu_{|c'|}) - 1. \]

It has been verified that when c is substituted for c' in (4.16) the formula agrees with that
for the second moment of a_1(c) given in § 6.

The coefficient of correlation is, of course, 


= MMMM^ (4‘17) 

with = 

Formulae for the first and second moments, to the approximation required, for the com- 
putation of (4-17) are given in § 6 . 

As an application, the following are the values of the variances and the covariance for the
tests of normality a(1) and a(4) (= b_2), i.e. in which c and c' have respectively the values
1 and 4, and where the universe belongs to the Pearson system with λ_2 = 1, λ_3 = 0 and λ_4 = ½:


JIC 0-09313705 0-262961 0-196477 


iAc\ 

^c’c' 


n 

4-4286 




n 

0-491 


92-2 5 

4-87 

H Y' 


-■f- 


w 

831-2 

’ 

281-6 


n' 


3 




(4-18) 


/‘lo|/*lc'l 


n 





From (4-17) and (4-18), jBcc'(«'== ^00)=;:^ -0-826 and Rc^in=co) = -0-764. It is of great 
interest to find that, though the universe is markedly non-normal, the correlation for
indefinitely large samples is practically identical with the normal theory value of −0.767
(Geary, 1936), another indication, no doubt, that normal theory inferences can usually be
applied with confidence when the parent universe is not markedly unsymmetrical.

When samples are indefinitely large we find, from (4-16) and (4-17), 

T> ^/^Ic+c'l " 2 (C/M|c1/*|o'+21 + c'/Mic'|/<|c+2|) + - C - 2 . c' — 2) y \ c \ y \ c '\ 


where, of course, the values to be taken here for and are found by substituting 
respectively c' for c and c for c' in the numerator. When, in addition, the parent universe is 
normal, we find 


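\[ R_{cc'} = \frac{\sqrt{\pi}\left(\frac{c+c'-1}{2}\right)! - \left(1+\tfrac{1}{2}cc'\right)\left(\frac{c-1}{2}\right)!\left(\frac{c'-1}{2}\right)!}{\left[\sqrt{\pi}\left(\frac{2c-1}{2}\right)! - \left(1+\tfrac{1}{2}c^2\right)\left\{\left(\frac{c-1}{2}\right)!\right\}^2\right]^{1/2}\left[\sqrt{\pi}\left(\frac{2c'-1}{2}\right)! - \left(1+\tfrac{1}{2}c'^2\right)\left\{\left(\frac{c'-1}{2}\right)!\right\}^2\right]^{1/2}}, \tag{4.20} \]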


which reduces to −1/√{12(π − 3)} for c = 1, c' = 4, as it should (Geary, 1935). The following
section will accord b_2 (i.e. a(4)) a decided primacy amongst tests of normality when the
samples are indefinitely large. It may, therefore, be of interest to give the values of the
correlation coefficients (for indefinitely large normal samples) between b_2 and a(c) for
selected values of c (Table 6). The table suggests, in the high coefficients of correlation,
except for c very near 0 or 2, that all the a(c) should be reliable tests of kurtosis, with no great
difference between their efficiencies. The efficiency of any two tests would be identical, in
the conditions stated, if the coefficient of correlation between them was ±1 because then,
of course, they would be functionally, and not stochastically, related.


Table 6. Correlation between b_2 and a(c) for indefinitely large normal samples

Value of c    Value of R        Value of c    Value of R
0             0                 3             0.980
1             −0.769            4             1
2             0                 5             0.983
2.2           0.887             6             0.939
2.5           0.962             ∞             0


5. The most efficient tests for indefinitely large samples

In this section we consider the efficiency of tests of kurtosis and asymmetry from the view- 
point of indefinitely large samples. 

By definition a test will be regarded as valid, in relation to a field of continuous alternative
universes including the normal, if its value for infinite samples drawn at random from the
normal universe is different from its value for infinite samples from other universes of the
field. As the sample number increases the test will become increasingly discriminatory of
the normal as distinct from other universes of the field. This increased sensitivity might be
given mathematical expression in some such terms as the following: given a probability α
(say 0.01), the normal universe W_0 of the field and any other distribution W_1 of the field,
a number n_α can be found so that for n > n_α the mean value of the test function for samples
of n from W_1 will be at or beyond the α probability point of the test function for samples of n
from W_0; the smaller n_α, the more sensitive the test.

We consider, then, the infinite field of alternative tests of kurtosis represented by (3.1)
when c assumes all positive values, and the infinite field of alternative universes represented
by the Gram-Charlier frequency (5.1).

The universal variance is assumed to be unity, without loss of generality. The normal
universe is a member of the field: it is found when all the λ_i (i > 2) are zero. We assume that
the conditions of § 2 are satisfied so that for indefinitely large samples the frequency
distribution of a(c) for all parent universes is normal. Obviously the efficiency of any particular
test (i.e. a(c) for a particular value of c) in regard to the normal and a particular non-normal
alternative (i.e. a Gram-Charlier frequency with particular values of the λ_i) will be adjudged
by considering the ratio of

(i) the difference between the universal mean values of a(c) for the normal and the
particular non-normal parent universes; to

(ii) the standard deviation of a(c) for indefinitely large normal samples.

The most efficient test will be a(c) for c a theoretically ascertainable function of the given
λ_i which makes the ratio a maximum.

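For indefinitely large samples the mean value θ of a(c) when the parent universe is given
by (5.1) is

\[ \theta = \int_{-\infty}^{\infty} |x|^{c} f(x)\,dx, \tag{5.2} \]

f(x) denoting the frequency (5.1). Obviously

\[ \int_{-\infty}^{\infty} dx\,|x|^{c}\,\frac{d^{2m+1}}{dx^{2m+1}}\,e^{-\frac{1}{2}x^2} = 0. \]

Also, when m ≥ 1,

\[ \int_{-\infty}^{\infty} dx\,|x|^{c}\,\frac{d^{2m}}{dx^{2m}}\,e^{-\frac{1}{2}x^2} = 2^{(c+1)/2}\left(\frac{c-1}{2}\right)!\;c(c-2)\cdots(c-2m+2), \tag{5.3} \]

a result readily inferable from the obvious fact that the left side vanishes for c = 0, 2, ...,
2m − 2. Accordingly

\[ \theta = \frac{2^{c/2}}{\sqrt{\pi}}\left(\frac{c-1}{2}\right)!\left\{1 + \sum_{m\ge 2}\frac{\lambda_{2m}}{(2m)!}\,c(c-2)\cdots(c-2m+2)\right\}. \tag{5.4} \]

The normal value is given by the first term.

From (4.3), (4.5) and (4.6) it is evident that the value of the standard deviation, for large
normal samples (retaining only n^{-1/2}) is

\[ \sigma = \frac{2^{c/2}}{\sqrt{(n\pi)}}\left[\sqrt{\pi}\left(\frac{2c-1}{2}\right)! - \left(1+\tfrac{1}{2}c^2\right)\left\{\left(\frac{c-1}{2}\right)!\right\}^2\right]^{1/2}. \tag{5.5} \]

The principal term in the deviation θ − θ_0 (where θ_0 is the normal value), from (5.4), is

\[ \frac{\left(\frac{c-1}{2}\right)!\,2^{c/2}}{\sqrt{\pi}}\cdot\frac{\lambda_4\,c(c-2)}{24}. \tag{5.6} \]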



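To a constant factor, the ratio δ/σ is given by the first discriminant

\[ p(c) = c(c-2)\left[\frac{\sqrt{\pi}\left(\frac{2c-1}{2}\right)!}{\left\{\left(\frac{c-1}{2}\right)!\right\}^2} - \frac{c^2+2}{2}\right]^{-1/2}. \tag{5.7} \]

It will now be shown that

\[ \frac{dp(c)}{dc} = 0 \quad\text{for } c = 4. \]

The discriminant may be written in the form

\[ p(c) = c(c-2)\left(\frac{2^{c}I_{2c}}{I_c} - \frac{c^2+2}{2}\right)^{-1/2}, \tag{5.8} \]

where

\[ I_c = \int_0^{\pi/2}\cos^{c}\theta\,d\theta, \tag{5.9} \]

and

\[ \frac{p'(c)}{p(c)} = \frac{1}{c} + \frac{1}{c-2} - \frac{1}{2}\left\{\frac{2^{c}I_{2c}}{I_c}\left(\log 2 + \frac{2J_{2c}}{I_{2c}} - \frac{J_c}{I_c}\right) - c\right\}\Big/\left(\frac{2^{c}I_{2c}}{I_c} - \frac{c^2+2}{2}\right). \tag{5.10} \]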

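From (5.9)

\[ J_c = I_c' = \int_0^{\pi/2} d\theta\,\cos^{c}\theta\,\log\cos\theta. \tag{5.11} \]

From a fairly well-known property

\[ J_0 = \int_0^{\pi/2} d\theta\,\log\cos\theta = -\tfrac{1}{2}\pi\log 2. \tag{5.12} \]

In (5.10) we shall be concerned only with even positive integer values of c. We have at once

\[ I_0 = \frac{\pi}{2}, \quad I_2 = \frac{\pi}{4}, \quad I_4 = \frac{3\pi}{16}, \quad I_6 = \frac{5\pi}{32}, \quad I_8 = \frac{35\pi}{256}. \tag{5.13} \]

From (5.11)

\[ J_{2c} = \int_0^{\pi/2}\cos^{2c}\theta\,\log\cos\theta\,d\theta = \int_0^{\pi/2} d(\sin\theta)\,\cos^{2c-1}\theta\,\log\cos\theta \]

which, by partial integration,

\[ = \int_0^{\pi/2} d\theta\,\sin\theta\left\{(2c-1)\sin\theta\cos^{2c-2}\theta\,\log\cos\theta + \frac{\cos^{2c-1}\theta\,\sin\theta}{\cos\theta}\right\} \]

\[ = (2c-1)(J_{2c-2} - J_{2c}) + I_{2c-2} - I_{2c}. \]

Hence

\[ 2cJ_{2c} = (2c-1)J_{2c-2} + I_{2c-2} - I_{2c}. \tag{5.14} \]

From (5.12), (5.13) and (5.14),

\[ J_0 = -\tfrac{1}{2}\pi\log 2, \quad J_2 = (-2\pi\log 2 + \pi)/8, \quad J_4 = (-12\pi\log 2 + 7\pi)/64, \]

\[ J_6 = (-60\pi\log 2 + 37\pi)/384, \quad J_8 = (-840\pi\log 2 + 533\pi)/6144. \tag{5.15} \]

Noting that dI_{2c}/dc = 2J_{2c}, and substituting in the right side of (5.10) the values of I and J
given by (5.13) and (5.15), we find p'(4) = 0. Table 7 gives the values of the discriminant for
certain values of c.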
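The maximal property can also be examined numerically. The sketch below assumes the moment form of the first discriminant, p(c) = c(c − 2){μ_{2c}/μ_c² − 1 − c²/2}^{-1/2} with μ_k = E|x|^k for the normal universe, which is algebraically the same quantity as the factorial form of the discriminant; the function names are illustrative.

```python
import math


def abs_moment(k):
    """E|x|^k for a standard normal variate."""
    return 2.0 ** (k / 2.0) * math.gamma((k + 1.0) / 2.0) / math.sqrt(math.pi)


def discriminant(c):
    """First discriminant p(c); not defined at c = 0 or c = 2, where the
    expression takes the form 0/0."""
    ratio = abs_moment(2.0 * c) / abs_moment(c) ** 2
    return c * (c - 2.0) / math.sqrt(ratio - 1.0 - c * c / 2.0)


if __name__ == "__main__":
    for c in (1.0, 3.0, 3.9, 4.0, 4.1, 5.0, 6.0):
        print(c, round(discriminant(c), 3))
```

For c = 4 the expression reduces to 8/√(8/3) = √24, which is why b_2, with large-sample variance 24/n, appears as the limiting case.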

The discriminant accordingly assumes a maximum value for c = 4, a result so remarkable
that one might be inclined to suspect that it is a consequence of the form which was assumed
for the alternative to the normal curve, a form which, in placing such emphasis on λ_4,
high-lights, so to speak, b_2 (= λ_4 + 3 when λ_2 = 1 for indefinitely large samples) as a test of
normality. From the algebraic point of view this is anything but obvious: the property
emerges from quite a complicated piece of algebra. It may also be emphasized that the field
of alternatives (5.1) is not arbitrary; it is a general form of frequency distribution when all
the λ_i are finite. Admittedly the discriminant takes account only of the term in λ_4 in the
expansion; but this is certainly the most significant term for a wide class of frequency
distributions, namely, those of homogeneous symmetrical functions of samples of n as n tends
towards infinity under very general conditions for the parent universe, provided that the
resulting frequency distribution can be assumed to have its third moment zero; for then the
only term in n^{-1} in the frequency distribution of the function will be the term in λ_4. The
significance of the property demonstrated must not be overstressed since it is subject to
many qualifications, but it gives strong grounds for holding that, for very large samples,
b_2 is the most efficient test of normality of tests of type a(c) in relation to a very extended
class of alternative universes. At the same time Table 7 shows that there can be little
difference in efficiency in the field a(c) for c ranging from close to 2 to about 6. There is but
little doubt, on this showing, that b_2 is more sensitive than a(1), a conclusion suggested on
the basis of certain experimental results by E. S. Pearson (1935) and examined from the
viewpoint of power function theory in § 6.


Table 7

0 < c < 2     Discriminant p(c)        2 < c < ∞     Discriminant p(c)
→ 0           −2.334                   2 + 0         4.460
0.1           −2.541                   2.1           4.508
0.2           −2.726                   2.5           4.666
0.5           −3.188                   3.0           4.801
0.7           −3.441                   3.9           4.898
1.0           −3.758                   4.0           4.900
1.1           −3.861                   4.1           4.898
1.5           −4.166                   5.0           4.818
1.9           −4.406                   6.0           4.602
2 − 0         −4.460                   7.0           4.288
                                       8.0           3.908


Adverting to (5.4) in conjunction with (5.6), it might be asked if, on the analogy of the
maximal property just demonstrated for the first discriminant, the function

\[ p_2(c) = c(c-2)(c-4)\left(\frac{2^{c}I_{2c}}{I_c} - \frac{c^2+2}{2}\right)^{-1/2} \]

has a turning point at c = 6. The answer is in the negative. The value of p_2'(6)/p_2(6) is
15/34. At the same time there must be a zero of p_2'(c) very near c = 6 since
p_2(5.9) = 8.79, p_2(6) = 9.20, p_2(6.1) = 8.66.

Analogous to the field of tests of kurtosis represented by (3.1) we may consider as a field
of tests of asymmetry:

\[ g(c) = \frac{\frac{1}{n}\left(\sum{}''\,|x_i-\bar{x}|^{c} - \sum{}'\,|x_i-\bar{x}|^{c}\right)}{\left\{\frac{1}{n}\sum (x_i-\bar{x})^2\right\}^{c/2}}, \tag{5.16} \]

where Σ' extends to the observations less than the mean x̄ and Σ'' to the rest of the sample.
For c = 3 the test is, of course, √b_1. For normal samples

\[ E\{g(c)\}^{k} = \frac{E\left\{\frac{1}{n}\left(\sum{}''\,|x_i-\bar{x}|^{c} - \sum{}'\,|x_i-\bar{x}|^{c}\right)\right\}^{k}}{E\left\{\frac{1}{n}\sum (x_i-\bar{x})^2\right\}^{kc/2}}, \tag{5.17} \]

the denominator of which is identical with the denominator of (4.1). Knowing the joint
distribution (for normal samples) of (x_1 − x̄), (x_2 − x̄), ... (Geary, 1936), there is no theoretical
difficulty in finding the mean values of the terms of the numerator for positive integer values
of k. Here we shall be concerned only with the first and second moments, i.e. those for (5.17)
for k = 1 and k = 2. We require the normal distribution of z_1 = x_1 − x̄ and the joint
distribution of z_1 and z_2 = x_2 − x̄. These are



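\[ \left\{\frac{n}{2\pi(n-1)}\right\}^{1/2}\exp\left\{-\frac{nz_1^2}{2(n-1)}\right\}dz_1, \]

\[ \frac{1}{2\pi}\left(\frac{n}{n-2}\right)^{1/2}\exp\left\{-\frac{(n-1)(z_1^2+z_2^2)}{2(n-2)} - \frac{z_1z_2}{(n-2)}\right\}dz_1\,dz_2 = f(z_1, z_2)\,dz_1\,dz_2. \tag{5.18} \]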


Clearly the odd normal moments of g{c) are zero. Then 


E 


.-E'\Xi-x\<= + 




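where F_1(z_1, z_2) is the mean value of the two-dimensional terms. We then have

\[ F_1(z_1, z_2) = \int_{-\infty}^{0}(-z_1)^{c}dz_1\int_{-\infty}^{0}(-z_2)^{c}dz_2\,f(z_1,z_2) - \int_{-\infty}^{0}(-z_1)^{c}dz_1\int_{0}^{\infty}z_2^{c}\,dz_2\,f(z_1,z_2) \]

\[ \qquad - \int_{0}^{\infty}z_1^{c}\,dz_1\int_{-\infty}^{0}(-z_2)^{c}dz_2\,f(z_1,z_2) + \int_{0}^{\infty}z_1^{c}\,dz_1\int_{0}^{\infty}z_2^{c}\,dz_2\,f(z_1,z_2) \]

\[ = \int_{0}^{\infty}\!\!\int_{0}^{\infty} z_1^{c}z_2^{c}\,dz_1\,dz_2\,\{f(-z_1,-z_2) - f(-z_1,z_2) - f(z_1,-z_2) + f(z_1,z_2)\}. \tag{5.19} \]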


2 =+V 71 y(n-2)c+Wc \2( (c + 2)2 (c -1- 2)2 (c -f 4)2 ) 

2tt U-2 / (w- 1)"+ 2\27 r'^3!(n-l)2+ 5\(n~lf 

(5-20) 



(5-21) 

Also 


( 6 - 22 ) 


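We now have all the expressions required for the variance of normal g(c). We require, for
what follows, only the term in n^{-1} which is

\[ \frac{1}{n}\left[\frac{2^{c}}{\sqrt{\pi}}\left(\frac{2c-1}{2}\right)! - \frac{2^{c+1}}{\pi}\left\{\left(\frac{c}{2}\right)!\right\}^2\right]. \tag{5.23} \]

Consider now a field of alternative universes represented by

\[ f(x) = \frac{1}{\sqrt{(2\pi)}}\,e^{-\frac{1}{2}x^2}\left\{1 + \frac{\lambda_3}{6}(x^3 - 3x)\right\}, \tag{5.24} \]

the 'first approximation to the law of error' (for universal variance unity), obviously the
most appropriate asymmetrical field, for different values of the parameter λ_3, and
containing as a member of the field the normal distribution found for λ_3 = 0. For indefinitely
large samples from (5.24) the mean value of g(c) is

\[ \delta = \frac{2^{(c+1)/2}}{\sqrt{\pi}}\left(\frac{c}{2}\right)!\,\frac{(c-1)\lambda_3}{6}. \tag{5.25} \]

From (5.23) and (5.25)

\[ \frac{\delta}{\sigma} = \frac{\lambda_3\sqrt{n}}{6}\,t(c), \tag{5.26} \]

the skew discriminant t(c) being given by

\[ t(c) = (c-1)\left\{\frac{2^{c+1}I_{2c+2}}{(2c+1)I_{c+1}} - 1\right\}^{-1/2}. \tag{5.27} \]

Log-differentiating,

\[ \frac{t'(c)}{t(c)} = \frac{1}{c-1} - \frac{1}{2}\,\frac{2^{c+1}I_{2c+2}}{(2c+1)I_{c+1}}\left(\log 2 + \frac{2J_{2c+2}}{I_{2c+2}} - \frac{J_{c+1}}{I_{c+1}} - \frac{2}{2c+1}\right)\Big/\left\{\frac{2^{c+1}I_{2c+2}}{(2c+1)I_{c+1}} - 1\right\}. \tag{5.28} \]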


Setting c = 3 and using (5.13) and (5.15), we find that t'(3) = 0. Values of t(c) for four
values of c are as follows:

c     t(c)          c     t(c)
2     2.370         4     2.389
3     2.450         5     2.236

Accordingly, for indefinitely large samples the test of asymmetry g(c) is most efficient for
c = 3, when the test becomes the familiar √b_1. The margin in favour of this value of c, as
compared with others in the range 2 < c < 5, is, however, quite small.
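The short table of t(c) can be recomputed from normal absolute moments. The sketch assumes the moment form t(c) = (c − 1) μ_{c+1} / √(μ_{2c} − c² μ_{c−1}²), with μ_k = E|x|^k for the normal universe, which follows from the large-sample mean and variance of g(c) as described in this section; the function names are illustrative.

```python
import math


def abs_moment(k):
    """E|x|^k for a standard normal variate (requires k > -1)."""
    return 2.0 ** (k / 2.0) * math.gamma((k + 1.0) / 2.0) / math.sqrt(math.pi)


def skew_discriminant(c):
    """Skew discriminant t(c) in moment form (assumed equivalent form, see lead-in)."""
    mean_part = (c - 1.0) * abs_moment(c + 1.0)
    variance_part = abs_moment(2.0 * c) - (c * abs_moment(c - 1.0)) ** 2
    return mean_part / math.sqrt(variance_part)


if __name__ == "__main__":
    for c in (2.0, 3.0, 4.0, 5.0):
        print(c, round(skew_discriminant(c), 3))
```

At c = 3 the expression is 6/√6 = √6, in agreement with the large-sample variance 6/n of √b_1.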


6. Tests of kurtosis from the power function viewpoint

It may be useful to open this section with an interpretation of the results of the previous
section from the point of view of the type of error theory of J. Neyman & E. S. Pearson
(1933, 1936). For this we consider two universes of the field, the normal W_0 and any
non-normal universe W_1, and two tests of kurtosis a(4) = b_2 and a(c_1) for a particular value
c_1 of c. Suppose that samples are sufficiently large that a(c), for samples from all universes
of the field, may be regarded as normally distributed.

Given a probability α, a sample number n can be found so that the mean value of a(c_1)
from W_1 lies exactly at, say, the upper α probability point of the distribution of a(c_1) from W_0.
Then from the results established in the preceding section the mean value of a(4) for the same
sample number n from W_1 would lie beyond the α probability point of a(4) for normal
samples of n. Suppose that the rule adopted was to regard as non-normal all samples for which
a(c) lies beyond the normal α probability point, and suppose that a very large number N of
samples were drawn, N_0 from universes not significantly different from normal (defining
'insignificance' in some manner) and N_1 from non-normal universes, so that N = N_0 + N_1,
where N_0 and N_1 are not necessarily known in advance. Then using a(c_1) the number of
erroneous allocations will be approximately αN_0 + ½N_1 whereas using a(4) the number will
be αN_0 + (½ − p)N_1 (½ > p > 0), showing a definite advantage in favour of a(4). The same
conclusion emerges whatever value of c ≠ 4 or whatever non-normal universe be taken
for comparison.
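The allocation argument can be illustrated with a rough calculation. The sketch below rests on assumptions that go beyond the text: a(c) is treated as normally distributed, its standard deviation under the alternative is taken equal to the normal-theory value, the shift of the mean in units of that standard deviation is taken as |p(c)| λ_4 √n / 24 with p(c) the first discriminant of § 5, and a single tail is used. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def abs_moment(k):
    """E|x|^k for a standard normal variate."""
    return 2.0 ** (k / 2.0) * math.gamma((k + 1.0) / 2.0) / math.sqrt(math.pi)


def discriminant(c):
    """First discriminant p(c) of section 5 (moment form; c not 0 or 2)."""
    ratio = abs_moment(2.0 * c) / abs_moment(c) ** 2
    return c * (c - 2.0) / math.sqrt(ratio - 1.0 - c * c / 2.0)


def large_sample_power(c, lambda4, n, alpha):
    """Approximate chance that a(c) falls beyond the normal alpha point."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    shift = abs(discriminant(c)) * abs(lambda4) * math.sqrt(n) / 24.0
    return NormalDist().cdf(shift - z_alpha)


def sample_size_for_half_power(c1, lambda4, alpha):
    """Sample number at which the mean of a(c1) sits exactly on the alpha point."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return (24.0 * z_alpha / (abs(discriminant(c1)) * abs(lambda4))) ** 2


def erroneous_allocations(c, lambda4, n, alpha, n0, n1):
    """alpha * N0 wrongly rejected plus (1 - power) * N1 wrongly accepted."""
    return alpha * n0 + (1.0 - large_sample_power(c, lambda4, n, alpha)) * n1


if __name__ == "__main__":
    n = sample_size_for_half_power(1.0, 0.5, 0.01)
    print(n, erroneous_allocations(1.0, 0.5, n, 0.01, 1000, 1000),
          erroneous_allocations(4.0, 0.5, n, 0.01, 1000, 1000))
```

With c_1 = 1 the first count is αN_0 + ½N_1 by construction, and the count for a(4) is smaller, which is the advantage (the quantity p) referred to above.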

The type of error approach reveals the theoretical weakness of using the method of § 5
for the assessment of relative efficiency of tests of normality; namely that the proportion of
errors of judgment, even using a(4), remains large, due fundamentally to concentrating on a
single value (the mean) as typical or representative of samples from the non-normal universe;
it is also a disadvantage that the sample number n is necessarily a function of the particular
value c_1 of c. The method has further disadvantages of which the principal are perhaps (i) a
somewhat restricted field of alternative universes; (ii) the assumption that the samples were
indefinitely large, essential to justify the normality of a(c) for samples from any member of
the universe field.

The Neyman-Pearson power function approach which will now be considered cannot be
regarded as entirely free from these objections in its application to the material so far
available from this research. It enables us, at any rate, to contemplate samples which, if not
small, are within the range of experimental practicability.

The problem of the relative efficiency of the different members of a field of tests of kurtosis
a(c) will now be considered in its power function aspects. For the present purpose the power
may be defined as follows:

Given a probability α (say 0.01), a sample number n, a particular value c_1 of c and a
non-normal parent universe W_1, the power, in relation to these data, represents the frequency
of a(c_1) for samples drawn at random from W_1 lying beyond the α probability point for a(c_1)
computed from samples drawn from a normal universe. The greater the power the more
discriminatory the test. Accordingly, it is in theory necessary to know the frequency
distribution of a(c) for all sample sizes, for all values of c and for all universes. Considering that
the only frequency distribution of the field contemplated which can be regarded as
determined for all sample sizes is a(1) for normal samples (Geary, 1935, 1936), many compromises
are necessary to give any kind of practical effect to the power concept. The compromises
proposed are as follows:

(1) The form a_1(c), given by (3.2), is used instead of the form a(c) given by (3.1).

(2) Only large samples are dealt with.

(3) The field of alternative universes is restricted.

Using a_1(c), the first four moments (from the origin) of a_1(c) for samples from any universe
can be expanded without real difficulty, and so approximate frequency distributions (using
the Karl Pearson or Gram-Charlier systems) can be obtained. As to (1), from experiments
in a(1) and a(4) the writer has verified that, for medium-sized normal samples, there is little
difference between the probability points (e.g. 0.01, 0.05) of a_1(c) and a(c), though the higher
semi-invariants (given n) are larger for the latter. In regard to (2) and (3) little confidence
could be reposed in the values of the moments computed from expansions even to n^{-3} unless
the sample number was at least of the order of 100 when c is greater than, say, 3; and, even
if the moments were known exactly, the empirical frequencies would be more than doubtful
for small samples. The approach finds its main justification in the consideration that any
errors due to these necessary compromises may be presumed to apply more or less equally
and in the same direction to the tests of kurtosis compared; generous, perhaps too generous,
advantage is taken of this justification in the concluding part of this section.


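Set, then,

\[ \alpha = \mu_{|c|}/\mu_2^{c/2}, \tag{6.1} \]

so that

\[ a_1(c) = \alpha\left(1+\frac{1}{n}\sum y_i\right)\left(1+\frac{1}{n}\sum z_i\right)^{-c/2}, \tag{6.2} \]

where

\[ y_i = (|x_i|^{c}-\mu_{|c|})/\mu_{|c|}, \qquad z_i = (x_i^2-\mu_2)/\mu_2, \tag{6.3} \]

the universal mean being taken as zero, without loss of generality. Raising (6.2) to powers
1, 2, 3, 4, expanding to the required degree the final factor, multiplying by the first factor
on the right, and setting down the mean value of each term we find, to n^{-3},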

Milcc = l-i{^l«(ll)-TO2)} + ^ {A^i>(12)-4«[(03) + 3(11)(02)] + 34«(02)2} 

+ ^ (02) - (13)] -tA«[(04) - 3(02)2 + 4(11} (03) + 6(12) (02)] 

- 4«[10(03) (02) + 16(11) (02)2] + 164W(02)}, (6-4) 

== l + i{4®>(02)~2fci2)(ii) + (20)}+l{-ii;O)(03)^3^(2)(02)2 

+ 2/42)(12) - 6/42>(ll) (02)- /tf(2l) + &(|)(20) (02) + 2&<|)(ii)2} 

+ ^ (^^'t{04) - 3(02)2] - lOfc^HOS) (02) + 16A;i2>(02)2 - 2l:®[(13) - 3(11) (02)] 

+ 4A:i2)[2(ll) (03) + 3(12) (02)]-30ii:®(ii) (02)2 

+ ]c'i\{22) - (20) (02)] - 42)|-(20) (03) + 3(21) (02)] - 2ife®(ii)a 

-642>(i2)(ll) + 12/42)(li)2(02) + 3/cf(20)(02)2}, (6-5) 

= l + ;i{4«>(02)-3^:f(ll) + 3(20)}+;i{-4®(03) + 3^(«(02)2 
+ 3ifc'j»’(12) - 9fc^»)(ll) (02) - S&f (21) + 3/b|>(20) (02) + 61:®(11)2 
+ (30) - 3]fci«(20) (11)}+ i{jfcl?'[(04) - 3(02)2] _ io/c®(03) (02) + 

yb 

- 34«[(13) - 3(1 1) (02)] + [2(1 1) (03) + 3(12) (02)] - i5kf{n) (02)2 

+ 3/4®[(22) - (20) (02)] - 34»[(20) (03) + 3(21) (02)] + 94»(20) (02)2 

- Qlcfi 11)2-1 842>( 12) (11) + 36/fc)f>(l 1)2 (02) - lcf{dl) 

+ kf{^Q) (02) + 3fcf (20) (11) 

+ 343'[(20) (12) + 2(21) (11)] -94®)(20) (11) (02)- 642)(ii)3}, (6-6) 

= l + -{4«(02)-4A:®(ll) + 6(20)}+~{-4^>(03) + 3/4*W 

72 * fv 

+ 4/4«(12) - 124 «( 11 ) (02) - 6A:f (21) + 6^:W(20) (02) + 12AW>(11)2 
+ 4(30) - 124«(20) (11) + 3(20)2} + 1{4«[(04) _ 3(02)2] 

71 

- (02) - 16fc<«(02)a- 4fe^)[(13) - 3(11) (02)] 

+ 87c®[2(l 1) (03) + 3(12) (02)] - mkf{\ 1 ) (02)2 + 6fcW[(22) - (20) (02)] 

- 6/c®[(20) (03) + 3(21) (02)]+ 18fc®(20) (02)2- 12^:®( 11)2 

- 36/t®(12) (11) + nkf\llf (02) - 4Jcf{Zl) + 44^>(30) (02) 

+ 12/t®(20) (11) + I2kf {{12) (20) + 2(21) (11)] 

-mf{20) (11) (02)-244«(ll)3 + (40)-4M*H30) (ll)-3(20)2 

- Qkf\2Q) (21) + 34*’(20)2 (02) + 12ib^«(20) (11)2), (6-7) 

where

\[ k_p^{(r)} = \frac{\tfrac{1}{2}pc\,(\tfrac{1}{2}pc+1)(\tfrac{1}{2}pc+2)\cdots(\tfrac{1}{2}pc+r-1)}{r!}, \qquad (fg) = E\,y_i^{f}z_i^{g}, \]

the latter, of course, the same for all i. The (fg) required for the computation of (6.4)-(6.7) are

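\[
\begin{aligned}
(11) &= (\mu_{|c+2|}-\mu_{|c|}\mu_2)/\mu_{|c|}\mu_2,\\
(02) &= (\mu_4-\mu_2^2)/\mu_2^2,\\
(12) &= (\mu_{|c+4|}-2\mu_{|c+2|}\mu_2+2\mu_{|c|}\mu_2^2-\mu_{|c|}\mu_4)/\mu_{|c|}\mu_2^2,\\
(03) &= (\mu_6-3\mu_4\mu_2+2\mu_2^3)/\mu_2^3,\\
(04) &= (\mu_8-4\mu_6\mu_2+6\mu_4\mu_2^2-3\mu_2^4)/\mu_2^4,\\
(13) &= [\mu_{|c+6|}-3\mu_{|c+4|}\mu_2+3\mu_{|c+2|}\mu_2^2-\mu_{|c|}(\mu_6-3\mu_4\mu_2+3\mu_2^3)]/\mu_{|c|}\mu_2^3,\\
(21) &= [\mu_{|2c+2|}-2\mu_{|c+2|}\mu_{|c|}-\mu_2(\mu_{|2c|}-2\mu_{|c|}^2)]/\mu_{|c|}^2\mu_2,\\
(22) &= (\mu_{|2c+4|}-2\mu_{|2c+2|}\mu_2+\mu_{|2c|}\mu_2^2-2\mu_{|c+4|}\mu_{|c|}+4\mu_{|c+2|}\mu_{|c|}\mu_2-3\mu_{|c|}^2\mu_2^2+\mu_{|c|}^2\mu_4)/\mu_{|c|}^2\mu_2^2,\\
(20) &= (\mu_{|2c|}-\mu_{|c|}^2)/\mu_{|c|}^2,\\
(30) &= (\mu_{|3c|}-3\mu_{|2c|}\mu_{|c|}+2\mu_{|c|}^3)/\mu_{|c|}^3,\\
(31) &= (\mu_{|3c+2|}-3\mu_{|2c+2|}\mu_{|c|}+3\mu_{|c+2|}\mu_{|c|}^2-\mu_{|3c|}\mu_2+3\mu_{|2c|}\mu_{|c|}\mu_2-3\mu_{|c|}^3\mu_2)/\mu_{|c|}^3\mu_2,\\
(40) &= (\mu_{|4c|}-4\mu_{|3c|}\mu_{|c|}+6\mu_{|2c|}\mu_{|c|}^2-3\mu_{|c|}^4)/\mu_{|c|}^4.
\end{aligned}
\tag{6.8}
\]
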

(6.8) is, of course, an immediate consequence of (6.3). The writer has checked the accuracy
of formulae (6.4)-(6.7) by reference to the normal universe for c = 1.

The reader will have no illusions as to the magnitude of the task of applying the foregoing
theory to particular cases. The formulae are set down, however, in the hope that other
researchers will be sufficiently sensible of the importance of the theory to assist in building
up a fairly extensive set of results. The writer has to be content, in the meantime, to consider
the case of the symmetrical universe field given by

\[ f(x) = \frac{1}{\sqrt{(2\pi)}}\,e^{-\frac{1}{2}x^2}\left\{1 + \frac{\lambda_4}{24}(x^4 - 6x^2 + 3)\right\} \tag{6.9} \]

when λ_4 = ½, the normal being given, of course, for λ_4 = 0, and for c = 4 and c = 1. These
values of c are selected because the theory in § 5 has suggested that a(4) is probably the most
efficient of the test-field a(c), while a(1) is the only member of the field for which the normal
distribution is known for samples of all sizes. The necessary moments (fg) given by (6.8)
are shown in Table 8.

Table 8. Moments (fg) from formulae (6.8)

            c = 4                            c = 1
(fg)        Normal       λ_4 = ½             Normal       λ_4 = ½
(11)        4            5.428571            1            1.17021276
(02)        2            2.5                 2            2.5
(12)        24           45.64286            3            4.88297871
(03)        8            14                  8            14
(04)        60           138                 60           138
(13)        216          544.2857            21           44.106383
(21)        256/3        177.71428           1.141593     1.75544898
(22)        2,720/3      2,481.92857         7.707963     14.766814
(20)        32/3         16.142857           0.570796     0.63834981
(30)        352          799.142857          0.429204     0.6405182
(31)        4,352        12,785.2653         3            5.236134
(40)        23,552       73,250.178          -            2.002492

Based on the values in this table, moments (M') given by (6.4)-(6.7) of a_1(c) and
semi-invariants (L) derived therefrom are as follows. The normal values are, of course, known
exactly but were computed for the purpose of checking the formulae:

c = 4; normal universe 
2 ^ 8 

3 “ 3 ~ 

9 3 a 9 ’ 

2 48 1040- ^3^64 2368 

8 40 3520 J:4_3840 

81 n» ’ 81 " a® ' 


A 

3'6 

( 3 - 5)8 


Mi 

3-6 ' 

:=!- 


0 = 4; universal A 4 = ^ 

, 3-367 11-822 12-1 

1 + — r~+~T-’ 

2-286 67-34 776-03 L 


n n‘^ 

3-215 107-47 2853-89 








(3-5)8 


4-4286. 92-25 831-2 
-f 


n 




n 


3 > 


144-61 6193-95 


(3-6)8 

(3-5)'^ 


n 


n‘ 




, 13-143 20-49 9529 

: 1 -1 1 ^ • 

n w 


(3-6)8 

L, 


n‘‘ 

10,687 




3 ' 


(3-6)^ 

0=1; normal universe 




7 - 0-19947114 0-02493389 0-03116737 

Li = ikf; =0=0-7978845608-1- -t- 


n 


0-04507034 0-07957747 0-03978874 


a-* 


n 


n‘ 


a-* 


^3=^- 


0-01685646 0-07613597 


n‘ 






0 = 1 ; universal 
0-35239362 0-169616 0-746838 


/ilil /^iii n n- > 

0-79792429 0-458012 1-800648 0-09313706 0-262961 0-196477 


/‘fil ' 


= 1 + 


■■1 + 


a a® a® 

1-336592 0-850081 3-239101 


a 




L. 0-063366 0-204164 
+- 


a” 


/‘fii 


a-^ 


a-* 


a-" 

= 0-78126197. 

Two sample sizes were considered: n = 100 and n = 500. For n = 100 and c = 4, the
following are the Pearson Type IV frequencies of a_1(4) when the parent universes are normal
and have β_2 − 3 = ½ respectively:

Normal: A4 = 0. i 


tan0 = (a;-l-873387)/0-765849, [ 
logio K = 3-2644696. ] 


( 6 - 10 ) 


A4 = K cos®'®®®® d dx, 

tan0 = (a:-2-8622)/0‘9062, 
logio K = T-7499974. 


( 6 - 11 ) 


The normal probability points shown in column (2) of Table 9 were derived from the
foregoing normal frequency (6.10); the points in column (3) were derived from a Gram-Charlier
formula (Geary, 1935). The 0.01 and 0.05 points given in column (2) are practically identical
with those given by E. S. Pearson (1929) for a(4), namely, 4.39 and 3.77. The powers given in
column (4) are the aggregate frequencies lying beyond the values of the variate shown in
column (2) on the assumption that the actual frequency was (6.11). The corresponding
figures for c = 1 given in column (5) were based on a Gram-Charlier formula.


Table 9. Power of a_1(c) for c = 4 and c = 1 of discriminating (6.9) for λ_4 = ½ from
the normal (λ_4 = 0) at four normal theory probability levels. Samples of 100


Normal theory 
prohahility 

(1) 

Normal theory probability points 

Power for frequency (6-9) with Ai= ^ 

II 

C=1 

(lower) 

(3) 

c = 4 

(4) 

C=1 

(5) 

O'Ol 

4-3836 

0-7482 

0-0648 

0-0696 


3-7744 

0-7642 


0-1979 


3-6196 

0-7726 




3-3110 

0-7824 

0-4626 



Before discussing the comparative powers in Table 9 it will be convenient to give a table, 10, on the same lines but for n = 500. On account of the larger sample size it has been necessary to change the reference probabilities given in column (1). For the construction of this table Gram-Charlier formulae were used throughout, the probability points being determined from the E. A. Cornish & R. A. Fisher (1937) formulae, after verifying that for two of the probability levels, 0.01 and 0.05, the probability points for c = 4 (column (2) of Table 10) did not differ appreciably from those given by E. S. Pearson, namely, 3.60 and 3.37 (for a(4)), based on a Type IV curve.

The analysis in § 5 has enabled us to come fairly firmly to the conclusion that for indefinitely large samples a(4) was to be preferred to a(1) as a test of normality. We see from Tables 9 and 10 that this is subject to an important qualification. Table 10 shows that the discriminating power is definitely greater for samples of 500 for a(4) than for a(1), but the superiority is less emphatic than might have been anticipated from § 5. For medium-sized samples (Table 9) a(4) exhibits no superiority. Of course, these conclusions are very tentative, as being based upon a single alternative and on particular sample sizes. The writer had proposed, in addition, to examine the universes (i) λ3 = 0, λ4 = 1 and (ii) λ3² = λ4 = ½ as alternatives to the normal but time did not permit; he ventures to repeat the hope that other students will take the matter up.












Table 10. Power of a(c) for c = 4 and c = 1 of discriminating (6.9) for λ4 = ½
from the normal (λ4 = 0) at four probability levels. Samples of 500

    Normal         Normal probability points      Power for frequency (6.9) with λ4 = ½
    probability    c = 4       c = 1 (lower)      c = 4       c = 1
    (1)            (2)         (3)                (4)         (5)

    0.005          3.7002      0.773167           0.1934      0.2067
    0.01           3.6094      0.776684           0.2920      0.2790
    0.05           3.3706      0.782482           0.6966      0.6196
    0.10           3.2095      0.786058           0.7392      0.6609


7. CONCLUSION AND SUMMARY

In § 2 of the present paper it is shown that the actual probability of differences between means and variances derived from random samples on the null-hypothesis may differ considerably from the probability derived from the standard tables (compiled on the assumption that the universal distribution is normal), when, in fact, the universal distribution is not normal. Accordingly, the standard tables cannot validly be used unless tests, based on the sample from which the inferences are to be drawn, or on a series of samples produced under similar conditions, have established the likelihood that the universal distribution is approximately normal. In certain cases (but these must be few) the nature of the material may, of itself, suffice to justify the assumption of universal normality. When universal normality cannot be assumed, the best course will be to correct the standard tables using, for this purpose, the moments (up to, say, the fourth) derived from the sample, in conjunction with the formulae given in § 2. This procedure is, of course, open to the objection that the moments derived from the sample may, in fact, differ substantially from the (in general unknown) universal moments, so that any probabilistic inference derived using sample moments must be accepted with reserve. If b2 = 3.5, say, it would be safer to assume that the universal value is 3.5, than to hope (without other evidence) that it is 3, the normal value; it might be 3.75 or even 4, when, usually, the standard table probabilities will be still further astray. It should not be difficult to construct supplementary tables giving very approximate corrections of the standard tables, using the moment expansions given in § 2, for different values of √β1 and β2. To compute unbiassed estimates of the latter, R. A. Fisher's k statistics (1929) should, of course, be used.

It may be asked if testing for normality and, when necessary, correction for universal non-normality is worth the trouble. To answer this question it is desirable to have regard to the logical position of the statistician, concerned with drawing inferences from samples, whose characteristic approach may be defined as reductio ad paene absurdum: if an event is highly improbable it must be regarded for practical purposes as impossible. St Thomas Aquinas's* famous 'certitude of probability' is peculiarly apt as applied to the mental attitude of the statistician, from two quite different viewpoints. The first is that decision, and action based on that decision, for which there is not certainty, but merely probabilistic preference, is absolute. One does not say that one has a preference of 20 to 1 for Fertilizer A over Fertilizer B because the difference between the yields is at or near the 5 % probability point of some test function; one necessarily decides without qualification that A is better than B.

* 'According to the Philosopher, certitude is not to be sought equally in every matter. ... Hence the certitude of probability suffices, such as may reach the truth in the greater number of cases, although it fails in the minority' (Summa IIa-IIae, q. lxx, a. 2).

The second aspect, which has the greater relevance in the present case, is that the statistician regards himself as endowed with 'certitude' when he knows that if he repeated an experiment, as to, say, significant differences in averages, a great number of times, he would be in error in attributing significant difference when, in fact, there was none, in a predetermined proportion of cases. He has certitude as to the probability though his decision in the individual case may be wrong. What is curious is that decisions (which, in effect, are absolute) can be based on probability levels which vary with the temperament of the statistician from perhaps a conservative 0.001 to a daring 0.1. For the particular statistician the probability level will vary with the case: for instance, the present writer would be inclined to suspect non-normality near the 10 % probability level of the a(1) table, whereas he would not be disposed to attach significance in, say, analysis of variance, until about the 2½ % level. Naturally the level will depend on the importance attaching to the decision.

Since all the statistician usually requires from the table of probability for a given measure of significance is whether, on the null-hypothesis, the probability is 'small', absolute precision is not necessary in the probability. If the probability is thought to be minute, say 0.001, it does not matter if in actual fact it is 0.002 or 0.0005. If, on the contrary, the standard table value is approaching the statistician's level of decision it surely matters a great deal: if he thinks his judgment is likely to be erroneous in 1 out of 20 experiments it must be of importance if, in fact, the true probability is something like 1 in 10 or 1 in 6. These are the kinds of contrasts that appear from § 2, from comparison of standard table probabilities with 'actual' probabilities found when the samples were assumed to be randomly drawn from certain arbitrarily selected types of non-normal universes. The computed probabilities in § 2 admittedly make no claim to exactitude in most of the cases, since the formulae were strained by their application to small sample theory. The point is, however, that the estimates of the actual probabilities are unbiassed in regard to the 'normal theory' probabilities: if the former could be closer to the latter, they might also be further away.

There is one case which is in a quite exceptional category, namely that considered at the 
beginning of § 2. As far as the writer is aware, this case has never been examined theoretically 
before, despite the extreme simplicity of the algebra. It is shown that in the simplest case 
of analysis of variance, when the two sample numbers are of the same order of magnitude, 
the variance is proportional, approximately, to — so that quite a small measure of 
universal kurtosis materially changes the probability. Statisticians must have been affected 
by a kind of hypnosis in favour of normal theory to have overlooked so trivial a point, 
a stricture from which the writer is not particularly concerned to exclude himself! An 
exception was E. S. Pearson (1931) who, on the basis of his results cited in § 2 (a), sounded 
a warning: ‘The illustration should serve to emphasize the fact that certain of the “normal 
theory” tests can be used with greater confidence than others when dealing with samples 
from populations whose distribution laws are not known.'

An interesting chapter could be written on the fluctuations in the attitude of statisticians during the past century on the question of the occurrence of the normal frequency distribution in nature, a chapter, perhaps, in a large work on Fashions in the Sciences down the Ages. Amongst the following the historian may find the reasons for the prejudice in favour of the hypothesis of universal normality up to, say, the end of the last century:

(1) The fact that, to a close approximation, it applies in a wide range of mathematical
conditions.

(2) The fact that the theory found practical applications predominantly in assessing the 
probability of errors in astronomical measurements and in games of chance where the 
mathematical model could reasonably be assumed to apply. 

(3) The beauty of the mathematical theory and the facility of algebraic manipulation in 
the function involved. 

(4) The general shape to the visual sense of such frequency distributions as were known, 
before imposed its discipline. 

With the development, about the beginning of the century, of the theory of moments, 
statisticians became almost over-conscious of universal non-normality. The concomitant 
semi-invariant approach had quite a different background. The difference between the 
moment and Karl Pearson curve system on the one hand and semi-invariants and the Gram- 
Charlier system on the other is fundamentally that for the former normality is a particular 
case like any other, whereas for the latter normality is basic and generative. Each system 
has its advantages and disadvantages as applied to the determination of frequency distributions of which the lower moments are known. In fanciful terms one might say that in the ship Gram-Charlier one might sail in perfect safety but only within limited, and more or less ascertainable, range of Port Normality, whereas in the good craft Pearson one can sail the seven seas, at one's own risk.*

Our historian will find a significant change of attitude about a quarter-century ago following on the brilliant work of R. A. Fisher who showed that, when universal normality could be assumed, inferences of the widest practical usefulness could be drawn from samples of any size. Prejudice in favour of normality returned in full force and interest in non-normality receded to the background (though one of the finest contributions to non-normal theory was made during the period by R. A. Fisher himself), and the importance of the underlying assumptions was almost forgotten. Even the few workers in the field (amongst them the present writer) seemed concerned to show that 'universal non-normality doesn't matter': we so wanted to find the theory as good as it was beautiful. References (when there were any at all) in the text-books to the basic assumptions were perfunctory in the extreme. Amends might be made in the interest of the new generation of students by printing in leaded type in future editions of existing text-books and in all new text-books:

Normality is a myth; there never was, and never will be, a normal distribution.

This is an over-statement from the practical point of view, but it represents a safer initial mental attitude than any in fashion during the past two decades.

As already indicated, the present work is incomplete, especially on the experimental side. The writer hopes that he has created a prima facie case for the importance of testing for normality.

* This comment must not be taken as applying to the problem of curve-fitting, i.e. to fitting a smooth curve to given frequencies, but to the problem of estimating the frequency function given the first few semi-invariants.

Summary

(i) Inferences drawn from the standard (normal) tables of z and t may be seriously in error if the conditions in which the standard tables apply (the principal of which is that the universes from which the samples are drawn are normal) are ignored.

(ii) Sufficient conditions are given for the approach to normality, with increasing sample size, of the field of tests of normality a(c) (given by (3.1)) for c > 0.

(iii) Many term expansions of the first four moments of a(c) for normal samples are given with practical applications designed to find the values of c for which the moments could be used with confidence to find the frequency distributions for medium-size samples; semi-invariants of a1(2.4) and a1(4) (a1(c) is given by (3.2)) are compared; correlations between a1(c) and a1(c′) are examined.

(iv) For indefinitely large samples and a wide field of alternative universes a(4) is found to be the most sensitive test of kurtosis and an analogous test of asymmetry g(c) is found to be most sensitive for c = 3, g(3) being the familiar √b1.

(v) An examination of the relative efficiency of a(1) and a(4) from the Power Function point of view suggests that a(4) is increasingly to be preferred as the sample size increases; for samples of moderate size a(1) is probably as efficient as a(4).

(vi) Throughout the paper a considerable range of formulae is given in case students may feel interested to carry the writer's researches a stage further so as to give a firmer basis to his conclusions or to modify them. It is suggested (§ 4) that the preparation of a table of probability points of a(2.2) for normal samples of different sizes be taken in hand.
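The tests a(1) and a(4) compared above can be computed directly from a sample. The sketch below is illustrative only: it assumes that a(c) is the mean of the absolute deviations raised to the power c, divided by the c-th power of the root-mean-square deviation (the defining formula (3.1) is not restated in this section), so that a(1) is the ratio of mean deviation to standard deviation and a(4) coincides with b2. The function name `a_stat` and the data are hypothetical.

```python
import math

def a_stat(sample, c):
    """Assumed form of a(c): mean |x - mean|^c divided by (second moment)^(c/2)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n      # second moment about the mean
    if m2 == 0:
        raise ValueError("sample has zero variance")
    mc = sum(abs(x - mean) ** c for x in sample) / n   # c-th absolute moment
    return mc / m2 ** (c / 2)

sample = [-2.0, -1.0, 0.0, 1.0, 2.0]   # hypothetical data
a1 = a_stat(sample, 1)                 # mean deviation / standard deviation
a4 = a_stat(sample, 4)                 # fourth moment ratio, i.e. b2
print(a1, a4)
```

Under this assumed definition a normal universe gives a(1) close to 0.7979 and a(4) close to 3 in large samples, which is why the lower tail is used for c = 1 and the upper tail for c = 4 when the alternative has positive kurtosis.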

REFERENCES

Baker, G. A. (1932). Ann. Math. Statist. 3, 1.
Bartlett, M. S. (1936). Proc. Camb. Phil. Soc. 31, 226.
Cornish, E. A. & Fisher, R. A. (1937). Rev. Inst. Int. Statist. 5, 307.
Craig, C. C. (1928). Metron, 7, 3.
Eden, T. & Yates, F. (1933). J. Agric. Sci. 23, 6.
Fisher, R. A. (1926). Metron, 5, 90.
Fisher, R. A. (1929). Proc. Lond. Math. Soc. (2), 30, 199.
Fisher, R. A. (1930). Proc. Roy. Soc. A, 130, 16.
Fisher, R. A. & Wishart, J. (1931). Proc. Lond. Math. Soc. (2), 33, 196.
Fréchet, M. (1937). Généralités sur les Probabilités. Variables aléatoires.
Geary, R. C. (1933). Biometrika, 25, 184.
Geary, R. C. (1935). Biometrika, 27, 310, 363.
Geary, R. C. (1936). J. Roy. Statist. Soc. (Supplement), 3, 178.
Geary, R. C. (1936). Biometrika, 28, 296.
Geary, R. C. (1947). Biometrika, 34, 68.
Geary, R. C. & Pearson, E. S. (1938). Tests of Normality.
Geary, R. C. & Worlledge, J. P. G. (1946). Biometrika, 34, 98.
Gosset, W. S. (1908). Biometrika, 6, 1.
Hsu, C. T. & Lawley, D. N. (1940). Biometrika, 31, 238.
Kendall, M. G. (1941). Biometrika, 32, 81.
Nair, A. N. K. (1942). Sankhyā, 5, 393.
Neyman, J. & Pearson, E. S. (1933). Philos. Trans. A, 231, 289.
Neyman, J. & Pearson, E. S. (1936). Statist. Res. Mem. 1, 1.
Pearson, E. S. (1929). Biometrika, 21, 337.
Pearson, E. S. (1930). Biometrika, 22, 239.
Pearson, E. S. (1931a). Biometrika, 22, 423.
Pearson, E. S. (1931b). Biometrika, 23, 114.
Pearson, E. S. (1936). Biometrika, 27, 333.
Pearson, E. S. & Adyanthaya, N. K. (1929). Biometrika, 21, 259.
Pearson, Karl (1895). Philos. Trans. A, 186, 343.
Pepper, Joseph (1932). Biometrika, 24, 55.
Rider, P. R. (1931). Ann. Math. Statist. 2, 48.
Rietz, H. L. (1939). Ann. Math. Statist. 10, 266.
Shewhart, W. A. & Winters, F. W. (1928). J. Amer. Statist. Ass. 23, 144.
Wishart, J. (1930). Biometrika, 22, 224.
Yasukawa, K. (1934). Tohoku Math. J. 38, 466.





THE STRATIFIED SEMI-STATIONARY POPULATION


By S. VAJDA


1. Constant population

Let a set of non-increasing real values $p_0 = 1, p_1, \ldots, p_n > p_{n+1} = 0$ be given, and let $p_i$ represent the probability of a person of age 0 surviving the $i$ following years. Further, let $l_0, l_1, \ldots, l_n$ represent the numbers of persons of age $0, 1, \ldots, n$ living at time $t = 0$. We consider then the development of such a population during the years following $t = 0$, under the assumption that the probabilities $p_i$ remain the same throughout the period investigated. Only persons of age 0 are to enter the population, and the number of such entrants shall be such that the total of the population is kept constant at a number $H = \sum_{i=0}^{n} l_i$. At the end of the first year the survivors of the $H$ persons who were alive at $t = 0$ will be (if we put $l_i/p_i = r_i$, say)

$$\sum_{i=0}^{n-1} l_i \frac{p_{i+1}}{p_i} = \sum_{i=0}^{n-1} r_i p_{i+1},$$

and therefore the number of entrants at the beginning of the second year (i.e. at $t = 1$) is

$$\phi_1 = H - \sum_{i=0}^{n-1} r_i p_{i+1}.$$

By the same argument the entrants at $t = 2$ will be

$$\phi_2 = H - \phi_1 p_1 - \sum_{i=0}^{n-2} r_i p_{i+2},$$

and so on; generally

$$\phi_t = H - \phi_{t-1} p_1 - \phi_{t-2} p_2 - \cdots - \phi_1 p_{t-1} - \sum_{i=0}^{n-t} r_i p_{i+t}, \qquad (1)$$

as long as $t \le n$, that is, as long as there are survivors of the initial population. For $t > n$ we obtain

$$H = \phi_t + \phi_{t-1} p_1 + \phi_{t-2} p_2 + \cdots + \phi_{t-n} p_n. \qquad (2)$$
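The year-by-year replacement rule behind (1) and (2) can be sketched numerically. The code below is an illustration, not the author's procedure: it carries each age group forward with the conditional survival ratio $p_{i+1}/p_i$ and admits just enough entrants of age 0 to restore the total $H$. The survival probabilities and initial numbers are illustrative values modelled on the worked example of this section, and the function name `project` is hypothetical.

```python
from fractions import Fraction as F

# Survival probabilities p_0..p_5 and initial numbers l_0..l_5 (illustrative values).
p = [F(1), F(7, 8), F(49, 96), F(5, 32), F(13, 384), F(1, 384)]
l = [859, 1269, 229, 50, 115, 56]

def project(p, l, years):
    """Return the entrants phi_1, ..., phi_years that keep the total constant."""
    n = len(p) - 1
    H = sum(l)
    pop = [F(x) for x in l]
    entrants = []
    for _ in range(years):
        # A member aged i survives to age i+1 with probability p[i+1]/p[i];
        # members of the highest age n all leave the population.
        survivors = [pop[i] * p[i + 1] / p[i] for i in range(n)]
        phi = H - sum(survivors)          # entrants needed to restore the total H
        pop = [phi] + survivors
        entrants.append(phi)
    return entrants

phi = project(p, l, 40)
print([round(float(x), 1) for x in phi[:5]])
print(float(phi[-1]), float(sum(l) / sum(p)))
```

With strictly decreasing $p_i$ the entrants settle down to $H/\sum p_i$, which the last line compares with the simulated value after 40 years.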

We want to find an expression for $\phi_t$ which must obviously depend on $l_0, l_1, \ldots, l_n$. Now (2) is a difference equation for the function $\phi_t$ of $t$ and can easily be solved. For this purpose consider the 'characteristic equation'

$$x^n + x^{n-1} p_1 + x^{n-2} p_2 + \cdots + x p_{n-1} + p_n = 0. \qquad (3)$$

Let this equation have the roots $x_1, x_2, \ldots, x_r$, where $x_i$ is a $k_i$-fold root and $x_i \ne x_j$. We have then as a solution of the difference equation (2)

$$\phi_t = H_1 + P_1(t) x_1^t + \cdots + P_r(t) x_r^t, \qquad (4)$$

where $H_1 = H / \sum_{i=0}^{n} p_i$ and $P_i(t) = \alpha_{i1} + \alpha_{i2} t + \cdots + \alpha_{ik_i} t^{k_i - 1}$. The $\alpha_{ij}$ must be found from the initial population, i.e. from equations of the form (1) which contain the first $n$ numbers of entrants $\phi_1, \phi_2, \ldots, \phi_n$. It is seen, by inspecting these equations, which are of the form ($t \le n$)

$$H = \phi_t + \phi_{t-1} p_1 + \cdots + \phi_1 p_{t-1} + r_0 p_t + r_1 p_{t+1} + \cdots + r_{n-t} p_n,$$

that they are equivalent to

$$r_k = H_1 + \sum_{i=1}^{r} \sum_{j=1}^{k_i} \alpha_{ij} (-k)^{j-1} x_i^{-k}. \qquad (5)$$

Hence the $\alpha_{ij}$ can be fixed, dependent on the $r_i = l_i/p_i$ and thus on the initial population. We have thus proved:

If a population with an age distribution $l_0, l_1, \ldots, l_n$ is subject to survival rates $p_i$ ($i = 1, 2, \ldots, n$), and if this population is kept constant by $\phi_t$ entrants of age 0 at the end of the $t$th year, then $\phi_t$ is given by (4), where the $x_i$ are the different roots of (3), and the $\alpha_{ij}$ must be found from the set (5).

The population after $t$ years will have the following age distribution:

$$\phi_t,\ \phi_{t-1} p_1,\ \ldots,\ \phi_{t-n} p_n.$$

It can easily be proved that, if the $p_i$ are decreasing (and not merely non-increasing), then for all the roots of equation (3) we have $|x_i| < 1$ and that any real root must be negative. Hence the $\phi_t$ will oscillate around their limit $\lim_{t \to \infty} \phi_t = H_1$. The age distribution of the population thus tends, again through oscillations, to $H_1, H_1 p_1, \ldots, H_1 p_n$, which may be called the intrinsic stationary population. Obviously, if the initial population has already this distribution, it will not alter any more and the number of entrants will be constant and $= H_1$. In such a case all $\alpha_{ij} = 0$, and $r_i = H_1$, whatever the $x_i$ may be.

On the other hand, if $p_i = p_{i+1}$ holds for one or more values of $i$, then we may get cycles, and this is easily seen for the equation $x^n + x^{n-1} + \cdots + x + 1 = 0$. All roots have modulus 1, and it depends on the initial population whether we are dealing with the stationary case or with periodic cycles. No tendency towards an intrinsic stationary population appears in such a case.

Example. Let us assume that we have the following probabilities of survival:

    p_1    p_2     p_3    p_4      p_5     p_6
    7/8    49/96   5/32   13/384   1/384   0

The characteristic equation can then be written

$$384x^5 + 336x^4 + 196x^3 + 60x^2 + 13x + 1 = 0,$$

which has the five different roots

$$x_{1,2} = -\tfrac14 \pm \sqrt{-\tfrac{5}{48}}, \qquad x_{3,4} = -\tfrac18 \pm \sqrt{-\tfrac{7}{64}}, \qquad x_5 = -\tfrac18.$$

The initial population will be assumed to be

    l_0    l_1    l_2    l_3    l_4    l_5
    859    1269   229    50     115    56

which implies $H_1 = 1000$ (approx.). Therefore the $r_i = l_i/p_i$ are

    r_0    r_1       r_2      r_3      r_4       r_5
    859    1450.38   448.86   319.14   3405.42   21420.90

From $r_k = 1000 + \alpha_1 x_1^{-k} + \cdots + \alpha_5 x_5^{-k}$ ($k = 1, 2, \ldots, 5$) we find

    α_1          α_2          α_3    α_4    α_5
    −70 + 60i    −70 − 60i    0      0      −1

It follows that the number of entrants in the year $t$ ($= 0, 1, 2, \ldots$) will be

$$\phi_t = 1000\left[1 + (-0.07 + 0.06i)\left(-\tfrac14 + \sqrt{-\tfrac{5}{48}}\right)^t + (-0.07 - 0.06i)\left(-\tfrac14 - \sqrt{-\tfrac{5}{48}}\right)^t - 0.001\left(-\tfrac18\right)^t\right].$$




These numbers are given in the first row of Table 1, which shows the evolution of the whole 
population. 

Table 1

    t        0      1      2      3      4      5      6      7      8 and after

    Age 0    859    996    1025   989    1001   1001   999    1000   1000
        1    1269   752    872    897    865    876    876    874    875 (= 1000 × 7/8)
        2    229    740    438    508    523    505    511    511    510 (= 1000 × 49/96)
        3    50     70     227    134    156    160    155    156    156 (= 1000 × 5/32)
        4    115    11     15     49     29     34     34     34     34 (= 1000 × 13/384)
        5    56     9      1      1      4      2      3      3      3 (= 1000 × 1/384)

    Total    2578   2578   2578   2578   2578   2578   2578   2578   2578
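As a check on the worked example, the closed form for the entrants can be evaluated directly. The sketch below assumes the factorisation $(8x + 1)(8x^2 + 2x + 1)(6x^2 + 3x + 1)$ of the characteristic polynomial $384x^5 + 336x^4 + 196x^3 + 60x^2 + 13x + 1$ and the coefficients $-70 \pm 60i$, 0, 0 and $-1$; it confirms numerically that the five values are roots and evaluates $\phi_t$ for the first few years. Function and variable names are illustrative.

```python
import math

coeffs = [384, 336, 196, 60, 13, 1]        # characteristic equation, highest power first

def poly(x, c=coeffs):
    """Evaluate the polynomial at x by Horner's rule."""
    value = 0
    for a in c:
        value = value * x + a
    return value

x1 = complex(-1 / 4, math.sqrt(15) / 12)   # -1/4 + sqrt(-5/48)
x3 = complex(-1 / 8, math.sqrt(7) / 8)     # -1/8 + sqrt(-7/64)
x5 = -1 / 8
roots = [x1, x1.conjugate(), x3, x3.conjugate(), x5]
alpha = [complex(-70, 60), complex(-70, -60), 0, 0, -1]

def phi(t):
    """Closed-form entrants in year t; the imaginary parts cancel in conjugate pairs."""
    return (1000 + sum(a * x ** t for a, x in zip(alpha, roots))).real

residuals = [abs(poly(x)) for x in roots]
first_years = [round(phi(t)) for t in range(3)]
print(residuals)
print(first_years, phi(40))
```

Since all five roots have modulus below 1, $\phi_t$ tends to 1000. Values for later years can differ from a printed table by a unit or so if the table was built up from rounded figures.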


2. Two constant populations

All this covers well-known ground.* A new problem arises, however, when we consider two initial populations with two sets of probabilities of survival, say $p_i$ ($i = 1, 2, \ldots, n_1$) and $\bar p_i$ ($i = 1, 2, \ldots, n_2$), where $p_0 = \bar p_0 = 1$ and $p_{n_1} \bar p_{n_2} \ne 0$. We ask now whether it is possible to keep both constant by the same number of yearly entrants. More precisely:

Let the two equations

$$\sum_{i=0}^{n_1} p_i x^{n_1 - i} = 0 \quad\text{and}\quad \sum_{i=0}^{n_2} \bar p_i y^{n_2 - i} = 0$$

have the roots $x_1, \ldots, x_r$ with multiplicities $k_1, \ldots, k_r$ and $y_1, \ldots, y_s$ with multiplicities $j_1, \ldots, j_s$ respectively. No two $x_i$ or two $y_i$ are equal and no $x_i$ or $y_i$ is zero. Under what further conditions, concerning the $x$'s and the $y$'s, can the expressions $\phi_t$ and $\psi_t$ then have the same numerical values for all integral values of $t$, i.e.

$$(H_1 - \bar H_1) + \sum_{i=1}^{r} P_i(t) x_i^t - \sum_{i=1}^{s} \bar P_i(t) y_i^t = 0 \quad\text{for } t = 0, 1, 2, \ldots, \qquad (6)$$

where

$$\phi_t = H_1 + \sum_{i=1}^{r} P_i(t) x_i^t \quad\text{with}\quad P_i(t) = \alpha_{i1} + \alpha_{i2} t + \cdots + \alpha_{ik_i} t^{k_i - 1} \qquad (7)$$

and

$$\psi_t = \bar H_1 + \sum_{i=1}^{s} \bar P_i(t) y_i^t \quad\text{with}\quad \bar P_i(t) = \beta_{i1} + \beta_{i2} t + \cdots + \beta_{ij_i} t^{j_i - 1}. \qquad (8)$$

Suppose first that none of the $x$'s equals any of the $y$'s. Then it is known that the determinant of any set of equations of the system (6) is not zero. It follows that we must have $H_1 = \bar H_1$ and all $\alpha$'s and $\beta$'s $= 0$, hence all $P_i(t)$ and $\bar P_i(t) \equiv 0$. In this case the two populations must already be stationary and therefore identical with the intrinsic stationary populations which are implied by the sets $p_i$ and $\bar p_i$, respectively.

On the other hand, if some of the $x$'s are equal to some of the $y$'s, say $x_1 = y_1, \ldots, x_m = y_m$, and all the others are different, then we find by the same argument that $H_1 = \bar H_1$ and $P_i = \bar P_i$ for the first $m$ values of $i$, whereas all the other $P_i$ and $\bar P_i$ are identically zero. (It is, of course, again possible that all the $P_i$ and $\bar P_i$ are identically zero and that we have, in fact, again the two intrinsic stationary populations.)

If all $x$'s are equal to the $y$'s, with equal multiplicities, then the two equations are equal and the two populations must be identical, if they are to be kept constant by equal numbers of entrants.

* It follows, for example, from results of P. H. Leslie (1945).

We have thus reached the following conclusion: If we assume that the two equations given above are not identical, and that the initial populations are not the intrinsic stationary ones, then they must be such that the two equations have some (but not all) roots equal and if we calculate the corresponding $P_i$ and $\bar P_i$ (see (4) and (5) of the previous section), then those corresponding to the equal roots must be identical and the others must vanish. This includes the case where the $x$'s and $y$'s are the same, but with different multiplicities, so that the $P_i$ and $\bar P_i$ do not all extend to the highest power of $t$ which would be admissible by (7) or (8) respectively.

If the two populations are the intrinsic stationary ones, then the numbers of entrants will be constant (i.e. independent of the year) and the two constants will be equal if and only if

$$\frac{H}{\sum p_i} = \frac{\bar H}{\sum \bar p_i},$$

where the first expression refers to the first and the second expression to the second population.

Example. In the example used in § 1 we have $\alpha_3 = \alpha_4 = 0$, and we can therefore try to obtain a second population which is kept constant by the same numbers of entrants as the first one. We construct an equation which has again the roots $-\tfrac14 \pm \sqrt{-\tfrac{5}{48}}$ and $-\tfrac18$, but not $-\tfrac18 \pm \sqrt{-\tfrac{7}{64}}$. Such an equation is, for example,

$$480x^5 + 396x^4 + 218x^3 + 62x^2 + 13x + 1 = 0,$$

which has the roots $-\tfrac18$, $-\tfrac14 \pm \sqrt{-\tfrac{5}{48}}$ and $-\tfrac{1}{10} \pm \sqrt{-\tfrac{9}{100}}$. This implies that the probabilities of survival, i.e. $\bar p_i$, are

    33/40    109/240    31/240    13/480    1/480,

and as the $\phi_t$ (and the $r_i = \phi_{-i}$) are to be the same as in § 1, the initial population must now be $\bar l_i = r_i \bar p_i$. Table 2 shows the development of such a population, and it will be seen that the first line is identical with that in Table 1.


Table 2

    t        0      1      2      3      4      5      6      7      8 and after

    Age 0    859    996    1025   989    1001   1001   999    1000   1000
        1    1196   709    822    845    816    826    826    824    825 (= 1000 × 33/40)
        2    204    658    390    452    465    449    455    455    454 (= 1000 × 109/240)
        3    41     58     187    111    129    132    128    129    129 (= 1000 × 31/240)
        4    92     9      12     39     23     27     27     27     27 (= 1000 × 13/480)
        5    45     7      1      1      3      2      2      2      2 (= 1000 × 1/480)

    Total    2437   2437   2437   2437   2437   2437   2437   2437   2437
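The claim that the two populations are kept constant by the same entrants can be checked by simulation. The sketch below is an illustration under stated assumptions: it takes the closed form with only the common roots present ($-\tfrac14 \pm \sqrt{-5/48}$ and $-\tfrac18$, with coefficients $-70 \pm 60i$ and $-1$), builds each initial population as $\phi_{-i}$ times the relevant survival probability, and then replaces losses year by year. The names `phi` and `entrants` are hypothetical, and unrounded numbers are used throughout.

```python
import math
from fractions import Fraction as F

p     = [F(1), F(7, 8),   F(49, 96),   F(5, 32),   F(13, 384), F(1, 384)]
p_bar = [F(1), F(33, 40), F(109, 240), F(31, 240), F(13, 480), F(1, 480)]

x1 = complex(-1 / 4, math.sqrt(15) / 12)      # common complex root (with its conjugate)

def phi(t):
    """Closed form containing only the roots common to both equations."""
    return 1000 + 2 * (complex(-70, 60) * x1 ** t).real - (-1 / 8) ** t

def entrants(surv, years):
    """Start from l_i = phi(-i) * surv[i] and keep the total constant."""
    surv = [float(s) for s in surv]
    n = len(surv) - 1
    pop = [phi(-i) * surv[i] for i in range(n + 1)]
    total = sum(pop)
    out = []
    for _ in range(years):
        survivors = [pop[i] * surv[i + 1] / surv[i] for i in range(n)]
        new = total - sum(survivors)          # entrants restoring the total
        pop = [new] + survivors
        out.append(new)
    return total, out

total_1, e1 = entrants(p, 12)
total_2, e2 = entrants(p_bar, 12)
max_gap = max(abs(a - b) for a, b in zip(e1, e2))
print(total_1, total_2, max_gap)
```

The two totals differ (about 2578 and 2437), yet the yearly entrants agree to rounding error, because the closed form satisfies both difference equations at once.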






3. Stratified population: two grades

The results of the previous sections will now be used for an investigation of the stratified 
population.* First, we consider a population split into a lower and a higher grade in the 
following way: 

We assume that all members of age 0 are in the lower grade only, but that all other ages 
may share in both grades. Apart from mortality, which operates on all members according 
to their age, we assume that at every age a certain proportion dependent on that age is 
‘promoted’, at the end of the year, from the lower into the higher grade. Our problem is 
to discover whether this can be done whilst maintaining the totals in both grades constant; 
naturally the grand total of the population must remain constant. 

It is sufficient to deal only with the lower grade, as the numbers at each age in the higher one can be found by subtracting those in the lower grade from the total population at that age. Now the lower grade is depleted by mortality and also by promotions. If the probability of remaining unpromoted until age $i$ is $t_i$, then the probability of not leaving the grade in this period is $p_i t_i = \bar p_i$, say. Since all entrants into the population are at the same time entrants into the lower grade, our problem thus reduces to the following:

Is it possible to find an initial population, stratified into two grades, such that, on the basis of mortality described by the $p_i$, the number of entrants every year necessary to keep the population constant is the same as that calculated on the basis of mortality-cum-promotion, described by the $\bar p_i$?

We can apply our results in § 2 to this case by considering the lower grade and the total population as the two populations given. It follows that the lower grade can only be kept constant by that number of entrants which is necessary for the total population, if the latter is initially such that some of the $P_i(t)$ which depend on it are either identically zero or at least do not extend to the highest degree indicated by the multiplicities of the corresponding roots in $\sum p_i x^{n-i} = 0$. In order to find a suitable initial population for the lower grade it is then necessary to find an equation $\sum \bar p_i y^{\bar n - i} = 0$ which has the roots, with the necessary multiplicities, which appear explicitly in $\phi_t$, as calculated from the original equation, but which is not identical with it. The degree of $\sum \bar p_i y^{\bar n - i} = 0$ may be lower than or equal to that of $\sum p_i x^{n-i} = 0$. If it is lower, then all members of the population will be in the higher grade at the highest age or ages.

This condition is not sufficient, however. In view of the interpretation of the equation containing the $\bar p_i$'s these coefficients must be positive and, as the lower grade is a part of the whole, we must have $\bar p_i \le p_i$ for all $i$. But it is not necessary that we have also $t_{i+1} \le t_i$. If the opposite holds, this could still bear a practical interpretation. It would mean that reversions occur from the higher into the lower grade.

If an equation with the necessary and sufficient properties can be found, then we take the $r_i = l_i/p_i$ which we had to start with and construct the initial population of the lower grade by writing the number at age $i$ as $\bar l_i = r_i \bar p_i = l_i \bar p_i / p_i$.

It will be seen that in such a population the age distributions change with the passage of time (tending to a stationary limit) but that nevertheless all entrants have the same combined prospects of survival and promotion. (Thus from the point of view of a member of the community his position is the same as if he entered a stationary population. His chances of promotion are unaffected by the changes in the age distribution of those in front of him. But the characteristics of the population as a whole, for instance the efficiency of the staff from the point of view of an employer may, of course, vary considerably.) Such a population will be called semi-stationary.

* Cf., for the stationary case, with continuous changes, H. L. Seal (1946).

Example. The population shown in Table 2 can be taken as representing a lower grade within the population given in Table 1. The ratios $t_i = \bar p_i / p_i$ are then:

    t_0    t_1      t_2       t_3     t_4    t_5
    1      33/35    218/245   62/75   4/5    4/5

Table 3 is constructed by subtracting Table 2 from Table 1 and thus shows the composition of the higher grade.

Table 3

    t        0      1      2      3      4      5      6      7      8 and after

    Age 1    73     43     50     52     49     50     50     50     50
        2    25     82     48     56     58     56     56     56     56
        3    9      12     40     23     27     28     27     27     27
        4    23     2      3      10     6      7      7      7      7
        5    11     2      0      0      1      0      1      1      1

    Total    141    141    141    141    141    141    141    141    141
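The ratios quoted in the example, and the side conditions of § 3, can be verified with exact fractions. The following sketch is illustrative: it takes the two sets of survival probabilities from the examples of §§ 1 and 2, forms $t_i = \bar p_i / p_i$, and checks that the lower grade is a part of the whole ($\bar p_i \le p_i$) and that the $t_i$ never increase, i.e. that no reversions are implied. Variable names are hypothetical.

```python
from fractions import Fraction as F

p     = [F(1), F(7, 8),   F(49, 96),   F(5, 32),   F(13, 384), F(1, 384)]
p_bar = [F(1), F(33, 40), F(109, 240), F(31, 240), F(13, 480), F(1, 480)]

# Probability of remaining unpromoted until age i.
t = [pb / pp for pb, pp in zip(p_bar, p)]

# The lower grade must be a part of the whole at every age.
is_part_of_whole = all(pb <= pp for pb, pp in zip(p_bar, p))

# Non-increasing t_i means nobody reverts from the higher to the lower grade.
no_reversions = all(t[i + 1] <= t[i] for i in range(len(t) - 1))

# Chance of staying unpromoted through one further year, given unpromoted so far.
yearly_unpromoted = [t[i + 1] / t[i] for i in range(len(t) - 1)]

print(t, is_part_of_whole, no_reversions)
```

The last list gives the year-on-year chance of remaining unpromoted; in this example it equals 1 between ages 4 and 5, so no promotions take place in the final year.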


4. Stratified population: more than two grades 

Let us now split up the higher grade as well. We have then, say, $k$ grades, with grades 2 and above forming the aggregate which was simply called the higher grade in § 3; grade 1 is identical with the lower grade of that section.

We assume further that promotions from any grade into the next higher one take place at the end of every year and that every promotee into any grade has to stay there for at least one year. Thus in any population the lowest possible age of grade $g$ is $g - 1$. The actual lowest ages may be different, because the first promotion rates different from 0 may concern higher ages than these. The rates of promotion can be different from grade to grade, but depend within each grade only on the age, as before.

We shall again investigate whether it is possible to keep the total numbers of every grade 
constant, even if the age distributions of the grades are changing. 

We have seen that the age distribution of the total population, after $t$ years, is

$$\phi_t,\ \phi_{t-1} p_1,\ \ldots,\ \phi_{t-n} p_n.$$

The distribution of grade 1 is, at the same time,

$$\phi_t,\ \phi_{t-1} \bar p_1,\ \ldots,\ \phi_{t-n} \bar p_n,$$

and it is assumed that the set of $\bar p_i$ is not identical with the set of $p_i$. Hence grades 2 and above will have the age distribution

$$\phi_t (p_0 - \bar p_0),\ \phi_{t-1}(p_1 - \bar p_1),\ \ldots,\ \phi_{t-n}(p_n - \bar p_n).$$

Let us assume that $\phi_{t-v}(p_v - \bar p_v)$ is the first item in this series which is not zero. Clearly we have $v \ge 1$. Then, as far as numbers of members (and not their individual careers) are concerned, this aggregate of grades 2 and above is equivalent to a population which has arisen from successive annual entrants $\phi_{t-v}(p_v - \bar p_v)$ who have been subject to rates of survival

$$q_i = \frac{p_{v+i} - \bar p_{v+i}}{p_v - \bar p_v}.$$

It must be understood, however, that 'survival' is here a balance between deaths and promotions into the grade, so that these rates may very well exceed unity.

The number of annual entrants into grade 2 is given by 


<i>i-piPv-P.) = ^i+^PS-v)o<^r^~^{p,-p,), 

where $x_1, \ldots, x_m$ are the common roots of $\sum_{i=0}^{n} p_i y^{n-i} = 0$ and $\sum_{i=0}^{n} \bar{p}_i y^{n-i} = 0$, with
multiplicities $h_i$ and $j_i$ respectively, and where the $P_i$ are polynomials whose order does not
exceed either $h_i - 1$ or $j_i - 1$. (They may all be identically zero.)

The $x_i$ are, of course, also roots of $\sum_{i=v}^{n} (p_i - \bar{p}_i) y^{n-i} = 0$, with multiplicities $g_i$ given by the
smaller of $h_i$ and $j_i$.

We ask now if it is possible to construct grade 2 alone in such a way that its total also
remains constant. The argument which has been used in § 3 shows that this is possible if another
equation of degree $n - v$ can be found whose coefficients, $q'_i$ say, are not larger than the
corresponding $q_i$ (and $q'_0 = 1$), which has once again the roots $x_1, \ldots, x_m$, with multiplicities
$g_1, \ldots, g_m$ at least. If $g_1 + \ldots + g_m = n - v$, then this is clearly impossible. If $g_1 + \ldots + g_m$ is
smaller than this value, then we can try to find such an equation. The initial population can then
also be found, if we multiply the initial population of grades 2 and above by $q'_i/q_i$. The grades 1, 2
and the aggregate of 3 and above can then be constructed and every stratum kept constant,
but with changing age distributions.

We can proceed in the same way and find at each step whether further splitting up is
possible beyond 3 grades, 4 grades, etc. It is seen that in general, if $\sum g_i = n - m$, and if
grade $g$ starts in fact at age $g - 1$, then $m + 1$ grades can exist.

The smallest value of $\sum g_i$ is 1, and in this extreme case $n$ grades can be constructed, i.e.
one less than the number of ages. The $n$th grade will then contain the ages $n - 1$ and $n$.
Further, since is a root of a; — = 0, the age distribution of this highest grade is 

Hi + - {Hi + ccixff’^) Xi = -HiXi- aix[~”^^ 


($x_1$ is, of course, negative).

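Example. We use again the same example as before. The characteristic equation for the
whole population was

$$x^5 + \tfrac{7}{8}x^4 + \tfrac{49}{96}x^3 + \tfrac{5}{32}x^2 + \tfrac{13}{384}x + \tfrac{1}{384} = 0,$$

and that for grade 1 alone

$$x^5 + \tfrac{33}{40}x^4 + \tfrac{109}{240}x^3 + \tfrac{31}{240}x^2 + \tfrac{13}{480}x + \tfrac{1}{480} = 0.$$

The difference between these two equations gives the equation for grade 2 and above

$$\tfrac{1}{20}x^4 + \tfrac{9}{160}x^3 + \tfrac{13}{480}x^2 + \tfrac{13}{1920}x + \tfrac{1}{1920} = 0.$$

This equation has, of course, the roots $-\tfrac{1}{8}$ and $\tfrac{1}{12}(-3 \pm i\sqrt{15})$ which are common to the two
characteristic equations of the fifth degree, and also a further root $-\tfrac{1}{2}$. Now there is
a biquadratic equation with the three specified common roots and not larger coefficients
(and having the coefficient of $x^4$ equal to unity), viz.

$$x^4 + \tfrac{33}{40}x^3 + \tfrac{17}{48}x^2 + \tfrac{1}{15}x + \tfrac{1}{240} = 0.$$

The fourth, irrelevant, root is $-\tfrac{1}{5}$. This equation leads to the following development: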

Grade 2 only 


t 

0 

1 

2 

3 

4 

6 

6 

7 

8 and after 

Age 1 

73 

43 

60 

62 

49 

60 

50 

60 

60 

2 

18 

00 

35 

41 

43 

40 

41 

41 

41 

3 

6 

8 

26 

14 

18 

19 

18 

18 

18 

4 

5 ' 

11 

4 

~l 

1 



5 



. 2 

3 

3 

3 

3 


112 

112 

112 

112 

112 

112 

112 

112 

112 


Grade 3 and above 


Age 2 

7 

22 

13 

16 

15 

16 

15 

16 

16 

3 

3 

4 

14 

9 

9 

9 

9 

9 

9 

4 

12 

2 

2 

5 

4 

4 

4 

4 

4 

6.. 


1 

— 

— 

1 

— 

1 

1 

1 


29 

29 

29 

29 

29 

29 

29 

29 

29 


Analysis into further grades is impossible in this case, because the characteristic equation
of the third grade does not have any roots apart from the three common roots of all previous 
equations. 
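
The root-sharing argument of this example can be checked with exact fractions. The sketch below is only an illustration: the cubic carrying the three common roots and the two extra roots ($-1/2$ for grades 2 and above, $-1/5$ for grade 2 alone) are assumptions reconstructed from the example, and the helper names `polymul` and `not_larger` are hypothetical.

```python
from fractions import Fraction as F

def polymul(a, b):
    """Multiply two polynomials given as coefficient lists (highest power first)."""
    out = [F(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# Assumed cubic carrying the three common roots (a reconstruction, not a quotation).
COMMON_CUBIC = [F(1), F(5, 8), F(11, 48), F(1, 48)]

# Grades 2 and above: extra root -1/2.  Grade 2 alone: extra root -1/5.
grade2_and_above = polymul(COMMON_CUBIC, [F(1), F(1, 2)])
grade2_only = polymul(COMMON_CUBIC, [F(1), F(1, 5)])

def not_larger(candidate, reference):
    """True if every coefficient of candidate is <= the matching reference coefficient."""
    return all(c <= r for c, r in zip(candidate, reference))

print(grade2_only)
print(not_larger(grade2_only, grade2_and_above))
```

Both products keep the common cubic as a factor, so the two populations call for entrants governed by the same roots, while the smaller coefficients leave room for a further grade above.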

5. Promotion rates dependent on seniority
We still consider more than two grades, but now we will assume that the promotion rates do 
not depend on the attained age but on the seniority, i.e. on the time spent in the grade, 
instead. In the lowest grade seniority is equivalent to age, because all members were sup- 
posed to enter at the lowest age only. If we consider again the two grades of § 3, but this time 
take note of differences in seniority, we find the following pattern: 


Age 

Lower 

grade 

Higher grade 

Total 

Seniority 

0 

1 


tc— 1 

0 







1 


it-xPiih-h) 





2 







a: 

i>t~xPA 





S ^t—xPx 





■■ 




Note. $t_i = \bar{p}_i/p_i$ and hence $t_0 = 1$.


















If we consider now promotion from grade 2 into grade 3, and if we introduce $u_s$, the
probability of not being promoted during $s$ years from grade 2 ($u_0 = 1$), we see that grades 2
and 3 (including higher grades, if any) will have the following constitution:


Grade 2 


Age 

Seniority 0 

1 


'C—l 

1 





2 





X 

Wo 


... 

^t-xVxih-h) 







Grade 3 


Age 

Seniority 0 

1 


a; — 2 

2 

3 

■” ^a) ' ('^0 “ '*'h) 

+ (^o-"^i) (Wx-Wa)] 

^t-zPa{h-h) (“o-Wi) 



X 

~ ^*-l) (Wo — Wj) 

+ (^x-a — ix-a) (Wi — Mj) -f- . . . 

^(-xPxCfe-s — ^i-a) (Wo~Wi) 

-fik-h) K-3-w»-a)] 

. . . 

it-xPxik-h) (Mo -Ml) 



1 1 



It follows by means of the same argument as before that grade 2 can be kept constant if
we can find the $u_i$ such that the equation

$$x^{n-1}p_1(t_0 - t_1) + x^{n-2}p_2[(t_1 - t_2) + (t_0 - t_1)u_1] + \ldots + p_n[(t_{n-1} - t_n) + (t_{n-2} - t_{n-1})u_1 + \ldots + (t_0 - t_1)u_{n-1}] = 0$$

has the same roots which were common to $x^n + x^{n-1}p_1 + \ldots + p_n = 0$ and

$$x^{n-1}p_1(t_0 - t_1) + x^{n-2}p_2(t_0 - t_2) + \ldots + p_n(t_0 - t_n) = 0,$$

which is identical with the difference of the first two equations of degree $n$, referring
respectively to the whole population and to the lowest grade. We must further insist that all $u_i$
have non-negative values, not larger than 1. The coefficients of the powers of $x$ must
also be positive, but it is not necessary that $u_{i+1} \le u_i$, unless we do not admit reversions.
If $m$ is the number of common roots, then it follows again as in the last section that $n - m + 1$
grades could exist which remain constant under the operation of promotions, but that their
age and seniority distributions change.

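Example. Dealing once more with the same example as in the previous sections, we have
to find a biquadratic equation

$$\tfrac{7}{8}\left(1 - \tfrac{33}{35}\right)x^4 + \tfrac{49}{96}\left[\left(\tfrac{33}{35} - \tfrac{218}{245}\right) + \left(1 - \tfrac{33}{35}\right)u_1\right]x^3 + \tfrac{5}{32}\left[\left(\tfrac{218}{245} - \tfrac{62}{75}\right) + \left(\tfrac{33}{35} - \tfrac{218}{245}\right)u_1 + \left(1 - \tfrac{33}{35}\right)u_2\right]x^2$$

$$+ \tfrac{13}{384}\left[\left(\tfrac{62}{75} - \tfrac{4}{5}\right) + \left(\tfrac{218}{245} - \tfrac{62}{75}\right)u_1 + \left(\tfrac{33}{35} - \tfrac{218}{245}\right)u_2 + \left(1 - \tfrac{33}{35}\right)u_3\right]x$$

$$+ \tfrac{1}{384}\left[\left(\tfrac{4}{5} - \tfrac{4}{5}\right) + \left(\tfrac{62}{75} - \tfrac{4}{5}\right)u_1 + \left(\tfrac{218}{245} - \tfrac{62}{75}\right)u_2 + \left(\tfrac{33}{35} - \tfrac{218}{245}\right)u_3 + \left(1 - \tfrac{33}{35}\right)u_4\right] = 0,$$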









or, if we use four significant figures in every fraction,

$$x^4 + (0.5417 + 0.5833u_1)x^3 + (0.1973 + 0.1658u_1 + 0.1786u_2)x^2$$

$$+ (0.01806 + 0.04274u_1 + 0.03593u_2 + 0.03869u_3)x$$

$$+ (0 + 0.001389u_1 + 0.003288u_2 + 0.002764u_3 + 0.002976u_4) = 0.$$

This biquadratic equation must have the roots $-\tfrac{1}{8}$ and $\tfrac{1}{12}(-3 \pm i\sqrt{15})$. If the fourth root is
called $(-z)$, then the equation must be identical with

$$(x^3 + \tfrac{5}{8}x^2 + \tfrac{11}{48}x + \tfrac{1}{48})(x + z) = 0.$$

Simple arithmetic shows then that

$$u_1 = 0.1429 + 1.7142z, \qquad u_2 = 0.0459 + 1.9082z,$$

$$u_3 = -0.1283 + 2.2566z \quad\text{and}\quad u_4 = 0.0019 + 1.9962z.$$

Now $z$ must be at least 0.05685 to make $u_3$ positive and it must not exceed 0.5, because
otherwise the $u_i$ would exceed unity. But then $u_4$ will always be larger than $u_3$, unless we put
$z = \tfrac{1}{2}$, which would mean $u_i = 1$ for all $i$, and then there would be no members at all in grades 3
and above. It follows that we must admit reversions from grade 3 into grade 2. We can then,
for instance, take $z = 0.2$ and have

$$u_1 = 0.4857, \quad u_2 = 0.4276, \quad u_3 = 0.3230 \quad\text{and finally}\quad u_4 = 0.4011.$$
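
The dependence of the $u_i$ on the free root can be explored numerically. The following is a minimal sketch using the four-figure linear relations of this example; the function names are hypothetical, and the check for reversions simply looks for an increase anywhere in the sequence $u_0 = 1, u_1, \ldots, u_4$.

```python
def promotion_survival(z):
    """Return (u1, u2, u3, u4) for a fourth root -z, from the four-figure relations."""
    consts = (0.1429, 0.0459, -0.1283, 0.0019)
    slopes = (1.7142, 1.9082, 2.2566, 1.9962)
    return tuple(c + s * z for c, s in zip(consts, slopes))

def is_admissible(u):
    """Every u_i must lie between 0 and 1 inclusive."""
    return all(0.0 <= v <= 1.0 for v in u)

def needs_reversions(u):
    """True if some u_{i+1} exceeds u_i, taking u_0 = 1."""
    seq = (1.0,) + tuple(u)
    return any(later > earlier for earlier, later in zip(seq, seq[1:]))

u = promotion_survival(0.2)
print([round(v, 4) for v in u])
print(is_admissible(u), needs_reversions(u))
```

Trying other values of `z` between the two bounds shows the same pattern: the set is admissible, but the last value always rises above the one before it.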

The biquadratic equation becomes

$$x^4 + \tfrac{33}{40}x^3 + \tfrac{17}{48}x^2 + \tfrac{1}{15}x + \tfrac{1}{240} = 0.$$

This is the same as the one used in § 4, and we can again write down the changing pattern of
the population, but this time taking also seniority into account:


Orade 2 
































Orade 3 


Age 

0 

1 

2 

3 

2 

7 








7 

22 




22 

13 


■ 


13 

16 




15 

3 

1 

2 

— 

— 

— 

3 

2 

2 




4 

6 

3 





14 

4 

6 





9 

4 

4 

3 

6 



12 

1 

— 

1 


2 



1 

1 


2 

1 

2 

2 

__ 

5 

5 

1 

2 

2 

2 

B 

B 

— 

— 


1 

1 

— 

B 

B 

— 

— 

— 

— 

1 

— 

— 



B 

B 

2 

B 


25 

2 

1 

1 


19 


B 


29 


7 

2 

1 

29 


4 

5 

6 

7 and later 

2 

16 





>’6 

16 




16 

15 




16 

16 




16 

3 

4 

6 

— 

— 

— 

9 

4 

6 



9 

4 

5 





9 

4 

6 

— 

— 


4 

1 

1 

2 

— 

— 

4 

1 

1 

2 

— 

■1 

n 

1 

2 


4 

1 

1 

2 

— 


6 

H 

1 


H 

H 

B 

— 


— 


B 

B 

1 

— 

— 

1 

— 

1 

— 

— 

1 

■ 


B 

B 

B 

B 



6 

2 


29 


B 

2 

B 



7 

2 

B 

29 


We find, as before, that further splitting up of grades is impossible, if the total in each
grade is to remain constant throughout the years.


Summary 

This investigation deals with a stratified population, which is subject to (i) mortality, 
dependent on age, and to (ii) promotion rates, indicating the ratios of members of a grade 
which are transferred to the next higher grade at the end of the year. 

Section 1 concerns a population which is not yet stratified and formulae are deduced to 
calculate the number of entrants at time t, necessary to replace yearly deaths and thus to 
keep the total of the population constant. This number depends clearly on the mortality 
rates and on the age distribution existing at time $t = 0$. In general the population tends
towards a limiting age distribution, the ‘intrinsic stationary population’. 

Section 2 considers two populations and conditions are derived for the case that they need, 
every year, equal numbers of entrants to keep them constant. 

Section 3 introduces the stratified population. Both mortality and promotion rates 
depend on the age, and they are independent of the time t. Under certain conditions one of 
the two populations considered in § 2 can be taken as the whole and the other as the lowest 
grade in it. It is shown how and when entries into the grade can, at the same time, replace 
both losses due to mortality in the whole population, and to mortality and promotion 
depleting the lowest grade. This can also be described by saying that the totals of both grades 
can be kept constant at the same time, although the age distributions change from year 
to year. 

Section 4 generalizes the results of the previous section for a population consisting of
$k$ grades. If the population is spread over $n$ ages, then it is shown that up to $n - 1$ grades














































A SIMPLE APPROACH TO CONFOUNDING AND FRACTIONAL 
REPLICATION IN FACTORIAL EXPERIMENTS 

By O. KEMPTHORNE, Rothamsted Experimental Station

Introduction

The design and analysis of factorial experiments was described in 1937 by Yates in
considerable detail. In his treatment Yates described first the $2^n$ system and then went on to deal
with $3^n$ experiments and experiments of the $2^m 3^n$ type. The $2^n$ system is capable of very
easy explanation, but with experiments of higher order both the design and analysis become
of increasing complexity. It is the purpose of this paper to present a general method by
which factorial designs of the $p^n$ type may be examined, in respect of both confounding and
fractional replication. The method will be described by explanation of the rules for the $2^n$
and $3^n$ systems and corresponds quite closely to that given by Fisher (1942). The present
approach presents confounding and fractional replication as different aspects of the same
process. Experimental designs suggested by Plackett & Burman (1946) are also discussed.

The $2^n$ system

In this system all combinations of $n$ factors each at two levels are tested. The totality of
treatment combinations may be represented by the points of an $n$-dimensional lattice, each
side being of unit length. Let the factors be $x_1, x_2, \ldots, x_n$ and take $n$ mutually orthogonal
axes $y_1, y_2, \ldots, y_n$. The point $(000 \ldots 0)$ will then represent the control treatment, $(1000 \ldots 0)$ the
treatment consisting of $x_1$ at the upper level and all the other factors at the lower level, and
so on. The treatment effect of $x_1$ is the difference of the means of the yields of plots receiving
$x_1$ and those not receiving $x_1$. It is therefore the difference between the mean of the plots
represented by points lying on the plane $y_1 = 1$ and the mean of those represented by the
points on the plane $y_1 = 0$. The interaction of $x_1$ and $x_2$ is the difference between the means
of those plots represented by $y_1 = 1$, $y_2 = 1$ or $y_1 = 0$, $y_2 = 0$ and those represented by
$y_1 = 0$, $y_2 = 1$ and $y_1 = 1$, $y_2 = 0$, i.e. the difference of the means of those plots for which

$$y_1 + y_2 = 2 \text{ or } = 0 \pmod 2,$$

and those for which

$$y_1 + y_2 = 1 \pmod 2.$$

Similarly, the triple interaction of $x_1$, $x_2$ and $x_3$ is the difference between the means of those
plots for which

$$y_1 + y_2 + y_3 = 0 \pmod 2,$$

and those for which

$$y_1 + y_2 + y_3 = 1 \pmod 2.$$

This process can be continued to the consideration of the interaction of $x_1, x_2, \ldots, x_n$, which
is the difference between the mean of those plots for which

$$y_1 + y_2 + y_3 + \ldots + y_n = 0 \pmod 2,$$

and the mean of those for which

$$y_1 + y_2 + y_3 + \ldots + y_n = 1 \pmod 2.$$

In the $n$-dimensional space parallel hyper-planes may be drawn containing the points of the
lattice, such that the total yield forming the positive part of an interaction is obtained from
a set of parallel hyper-planes equidistant from each other. Likewise the negative part is
obtained from another set of parallel hyper-planes, each plane of which lies midway between
two planes of the first set.
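
The parity rule above translates directly into a computation. The sketch below is illustrative only: the yields are a hypothetical $2^2$ example, the function name `interaction` is invented here, and the sign convention (positive part taken as the parity equal to the number of factors, mod 2) is an assumption chosen so that a main effect is upper level minus lower level.

```python
from itertools import product

def interaction(yields, factors):
    """Contrast for the interaction of the factors with the given indices.

    yields maps each lattice point (a tuple of 0/1 levels) to a plot yield.
    Points are split by the parity of the sum of the chosen coordinates.
    Assumed sign convention: the positive part is the parity equal to
    len(factors) mod 2.
    """
    positive_parity = len(factors) % 2
    pos, neg = [], []
    for point, value in yields.items():
        parity = sum(point[i] for i in factors) % 2
        (pos if parity == positive_parity else neg).append(value)
    return sum(pos) / len(pos) - sum(neg) / len(neg)

# Hypothetical 2^2 yields: 10 + 3*y1 + 2*y2 + 4*y1*y2.
yields = {(a, b): 10 + 3 * a + 2 * b + 4 * a * b
          for a, b in product((0, 1), repeat=2)}
main_x1 = interaction(yields, [0])
inter_x1x2 = interaction(yields, [0, 1])
print(main_x1, inter_x1x2)
```

The same function handles any number of factors, since only the sum of the selected coordinates modulo 2 is needed.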


The $3^n$ system

With $n$ factors at each of three levels the treatment combinations are given by an
$n$-dimensional lattice, each side being of length two units and containing three points. The
treatment contrasts may be described as in the $2^n$ system with some slight modifications.

Any contrast in the $3^n$ system involves the comparison of three totals of the yields of $3^{n-1}$
plots, and may be represented by the comparison of the differences between the yields of the
plots lying on three sets of parallel hyper-planes. For example, if $n = 2$ the lattice is as
follows:

                y_1
             0   1   2
          0  .   .   .
    y_2   1  .   .   .
          2  .   .   .

The main effect of $x_1$ is the difference between the totals of yields of plots for which $y_1 = 0$,
$y_1 = 1$ and $y_1 = 2$. The $I$ component* of the interaction of $x_1$ and $x_2$ is the difference between
the totals of the yields of plots for which

$$y_1 - y_2 = 0, \quad y_1 - y_2 = 1, \quad\text{and}\quad y_1 - y_2 = 2.$$

The $J$ component is given by the contrast between the yields of plots for which

$$y_1 + y_2 = 0, \quad y_1 + y_2 = 1, \quad\text{and}\quad y_1 + y_2 = 2.$$

Anticipating the extension to cases when $n$ is greater than 2, the equations for the $I$
component may be written as follows:

    $X_1X_2(I_0)$:  $y_1 + 2y_2 = 0 \pmod 3$,
    $X_1X_2(I_1)$:  $y_1 + 2y_2 = 1 \pmod 3$,
    $X_1X_2(I_2)$:  $y_1 + 2y_2 = 2 \pmod 3$.

If $x_1$ and $x_2$ (and therefore $y_1$ and $y_2$) are interchanged, then $X_2X_1(I_0)$ is given by the
equation $y_2 + 2y_1 = 0$, $X_2X_1(I_1)$ by $y_2 + 2y_1 = 1$, $X_2X_1(I_2)$ by $y_2 + 2y_1 = 2$, all mod 3. But
the equation $y_2 + 2y_1 = 0 \pmod 3$ is identical with the equation $y_1 + 2y_2 = 0 \pmod 3$, since
$3y_1 + 3y_2 = 0 \pmod 3$, whatever the values of $y_1$ and $y_2$; $X_2X_1(I_0)$ is therefore equal to
$X_1X_2(I_0)$. Subtracting the equation $y_2 + 2y_1 = 1 \pmod 3$ from the equation

$$3y_1 + 3y_2 = 0 = 3 \pmod 3,$$

we get $y_1 + 2y_2 = 2 \pmod 3$; $X_2X_1(I_1)$ is therefore identical with $X_1X_2(I_2)$. It is obvious
from the equations given above for the $J$ component that $X_1X_2(J_i) = X_2X_1(J_i)$ for $i = 0$,
1 and 2.
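
The effect of interchanging the two factors can be confirmed by listing the lattice points in each class. This is a small sketch with a hypothetical helper `classes`; it partitions the $3^2$ lattice by the value of a linear form modulo 3.

```python
from itertools import product

def classes(coeffs, p=3):
    """Partition the lattice points by the value of sum(c * y) modulo p."""
    groups = {r: set() for r in range(p)}
    for y in product(range(p), repeat=len(coeffs)):
        groups[sum(c * v for c, v in zip(coeffs, y)) % p].add(y)
    return groups

I_12 = classes((1, 2))  # X1X2(I): y1 + 2*y2
I_21 = classes((2, 1))  # X2X1(I): y2 + 2*y1, written in (y1, y2) order
J_12 = classes((1, 1))  # J component: y1 + y2, symmetric in the two factors

print(I_21[0] == I_12[0], I_21[1] == I_12[2], I_21[2] == I_12[1])
```

The three point sets are the same in both orderings; only the labels 1 and 2 change places, so the contrast among the three totals is unaltered.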

* Yates's terminology for the components of interactions is used where convenient, but it is more
convenient to refer to $I_1$, $I_2$ and $I_3$ as $I_0$, $I_1$ and $I_2$ respectively.




Considering the case $n = 3$, it is easily seen that the second order interaction may be
split into four parts each consisting of the contrasts between three totals. These may be
represented by the following equations:

    (I)    $y_1 + y_2 + y_3 = 0, 1, 2 \pmod 3$,
    (II)   $y_1 + 2y_2 + y_3 = 0, 1, 2 \pmod 3$,
    (III)  $y_1 + y_2 + 2y_3 = 0, 1, 2 \pmod 3$,
    (IV)   $y_1 + 2y_2 + 2y_3 = 0, 1, 2 \pmod 3$.

In order these have been named by Yates

    $Z$, $X$, $Y$, $W$.

It is interesting in passing to note the relations between $Z$, $X$, $Y$ and $W$ for permutations of
the order of the factors. It is obvious from (I) that $Z$ is invariant for any change in order of
the three factors $x_1$, $x_2$ and $x_3$. Interchanging $y_2$ and $y_3$, equations (II) become equations
(III), so that $ABC(X) = ACB(Y)$. The following interchanges may be easily verified (using
the equation $3y_1 + 3y_2 + 3y_3 = 0 \pmod 3$ where necessary):

$$ABC(X) = BCA(W) = CAB(Y) = ACB(Y) = CBA(X) = BAC(W).$$


From the equations, it is clear that $Z$, $Y$, $X$ and $W$ may be computed in the way given by
Yates, since

$$Z = J\{x_1, J(x_2, x_3)\}, \qquad Y = J\{x_1, I(x_2, x_3)\},$$

$$X = I\{x_1, I(x_2, x_3)\}, \qquad W = I\{x_1, J(x_2, x_3)\},$$

$I(x_2, x_3)$ and $J(x_2, x_3)$ being evaluated for each level of $x_1$. The extension to the case $n = 4$ is
again obvious; the main effects, two-factor and three-factor interactions, follow as in the
above, and the four-factor interaction may be split into eight comparisons of three totals:

    I     $y_1 + y_2 + y_3 + y_4 = 0, 1, 2 \pmod 3$,
    II    $y_1 + y_2 + y_3 + 2y_4 = 0, 1, 2 \pmod 3$,
    III   $y_1 + y_2 + 2y_3 + y_4 = 0, 1, 2 \pmod 3$,
    IV    $y_1 + y_2 + 2y_3 + 2y_4 = 0, 1, 2 \pmod 3$,
    V     $y_1 + 2y_2 + y_3 + y_4 = 0, 1, 2 \pmod 3$,
    VI    $y_1 + 2y_2 + y_3 + 2y_4 = 0, 1, 2 \pmod 3$,
    VII   $y_1 + 2y_2 + 2y_3 + y_4 = 0, 1, 2 \pmod 3$,
    VIII  $y_1 + 2y_2 + 2y_3 + 2y_4 = 0, 1, 2 \pmod 3$.



As in the case of two factors, the effect of permutations of the order on the components of
$X$, $Y$, $Z$ and $W$ may be easily obtained. The four-factor interactions may be computed by
putting the equations given above into the following form:

    I   = $J\{x_1, Z\}$,        V    = $I\{x_1, W\}$,
    II  = $J\{x_1, Y\}$,        VI   = $I\{x_1, X\}$,
    III = $J\{x_1, X\}$,        VII  = $I\{x_1, Y\}$,
    IV  = $J\{x_1, W\}$,        VIII = $I\{x_1, Z\}$,

where the three components of $W$, $X$, $Y$, $Z$ of $x_2$, $x_3$, $x_4$ (in that order) are evaluated for
each level of $x_1$.


The $p^n$ system

The total of $p^n - 1$ degrees of freedom, where $p$ is a prime, in the analysis of variance of
a $p^n$ experiment may be split into $(p^n - 1)/(p - 1)$ sets of $(p - 1)$ degrees of freedom, the
contrasts being given by the following hyper-planes:

Main effects (mod $p$):

    $y_1 = 0, 1, 2, \ldots, p - 1$,
    $y_2 = 0, 1, 2, \ldots, p - 1$,
    ...
    $y_n = 0, 1, 2, \ldots, p - 1$.

Interactions of pairs of factors, e.g. of $x_1$ and $x_2$ (mod $p$):

    $y_1 + y_2 = 0, 1, 2, \ldots, p - 1$,
    $y_1 + 2y_2 = 0, 1, 2, \ldots, p - 1$,
    ...
    $y_1 + (p - 1)y_2 = 0, 1, 2, \ldots, p - 1$,

and so on to the interaction between all the factors which is given by the hyper-planes

$$\alpha_1 y_1 + \alpha_2 y_2 + \alpha_3 y_3 + \ldots + \alpha_n y_n = 0, 1, 2, \ldots, p - 1 \pmod p,$$

where $\alpha_1$ equals 1 and $\alpha_2, \alpha_3, \ldots, \alpha_n$ each may take all values from 1 to $p - 1$.
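
The count of $(p^n - 1)/(p - 1)$ sets can be checked by enumerating coefficient vectors. The sketch below assumes, as an illustration, that each set of parallel hyper-planes is labelled by the vector of coefficients whose first non-zero entry is 1; the function name `effect_vectors` is hypothetical.

```python
from itertools import product

def effect_vectors(p, n):
    """Coefficient vectors (a_1, ..., a_n) mod p whose first non-zero entry is 1."""
    out = []
    for a in product(range(p), repeat=n):
        nonzero = [c for c in a if c != 0]
        if nonzero and nonzero[0] == 1:
            out.append(a)
    return out

for p, n in [(2, 3), (3, 2), (3, 3), (5, 2)]:
    vectors = effect_vectors(p, n)
    print(p, n, len(vectors), (p ** n - 1) // (p - 1))
```

Vectors with a single non-zero entry are the main effects, those with two non-zero entries are the two-factor components, and so on.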


Simplification of notation

The $p^n - 1$ degrees of freedom in the $p^n$ system may be split into $(p^n - 1)/(p - 1)$ sets of
$(p - 1)$ degrees of freedom, given by the above hyper-planes, but it is only necessary to
specify one hyper-plane of each set of the parallel hyper-planes.

All the comparisons may be denoted by $y_1^{\alpha_1} y_2^{\alpha_2} \ldots y_n^{\alpha_n}$, the symbol meaning that the
comparisons are given by the hyper-planes

$$\alpha_1 y_1 + \alpha_2 y_2 + \ldots + \alpha_n y_n = 0, 1, 2, \ldots, p - 1 \pmod p.$$

In order to obtain an enumeration which covers all the possibilities once and once only, it is
necessary to use the rule that the factors are always written down in ascending order,
i.e. $y_i^{\alpha_i} y_j^{\alpha_j} y_k^{\alpha_k}$, etc., such that $i < j < k \ldots$ and that $\alpha_i = 1$.



The $3^3$ system in the revised notation

As an example, the $3^3$ system will be examined in detail. The effects are represented by
$y_1$, $y_2$, $y_3$; interactions between pairs, $y_1y_2$, $y_1y_2^2$, $y_1y_3$, $y_1y_3^2$, $y_2y_3$, $y_2y_3^2$; interactions between
all three factors, $y_1y_2y_3$, $y_1y_2y_3^2$, $y_1y_2^2y_3$, $y_1y_2^2y_3^2$; any other combination of powers of the $y$'s
can be reduced to the above set.

It is interesting to examine the interactions of the effects and interactions. In the case of
the $2^n$ system, Yates refers to the generalized interaction of two interactions $ABCD$ and
$CDE$ say, which is $ABE$. The interaction of effects or interactions $A$ and $B$ consists of $AB$
and $AB^2$ in the $3^n$ system.

(a) The interactions of main effects are obviously interactions between pairs of factors.

(b) The interactions of main effects and two-factor interactions with one letter in common
are two-factor interactions and main effects: e.g. the interactions of $y_1$ and $y_1y_2$ are

$$y_1^2y_2 = y_1y_2^2, \quad\text{and}\quad y_1^3y_2^2 = y_2,$$

and the interactions of $y_1$ and $y_1y_2^2$ are $y_1^2y_2^2 = y_1y_2$, and $y_1^3y_2^4 = y_2$.

(c) The interactions between main effects and three-factor interactions are two-factor and
three-factor interactions:

    Between                       Interactions
    $y_1$ and $y_1y_2y_3$         $y_1^2y_2y_3 = y_1y_2^2y_3^2$,     $y_1^3y_2^2y_3^2 = y_2y_3$
    $y_1$ and $y_1y_2y_3^2$       $y_1^2y_2y_3^2 = y_1y_2^2y_3$,     $y_1^3y_2^2y_3^4 = y_2y_3^2$
    $y_1$ and $y_1y_2^2y_3$       $y_1^2y_2^2y_3 = y_1y_2y_3^2$,     $y_1^3y_2^4y_3^2 = y_2y_3^2$
    $y_1$ and $y_1y_2^2y_3^2$     $y_1^2y_2^2y_3^2 = y_1y_2y_3$,     $y_1^3y_2^4y_3^4 = y_2y_3$


(d) The interactions between two-factor interactions are exemplified in the following table:

    Between                       Interactions
    $y_1y_2$ and $y_1y_2^2$       $y_1^2y_2^3 = y_1$,         $y_1^3y_2^5 = y_2$
    $y_1y_2$ and $y_2y_3$         $y_1y_2^2y_3$,              $y_1y_2^3y_3^2 = y_1y_3^2$
    $y_1y_2$ and $y_2y_3^2$       $y_1y_2^2y_3^2$,            $y_1y_2^3y_3^4 = y_1y_3$


(e) The interactions between two-factor and three-factor interactions are exemplified in
the following table:

    Between              $y_1y_2$                    $y_1y_2^2$
    $y_1y_2y_3$          $y_1y_2y_3^2$, $y_3$        $y_1y_3^2$, $y_2y_3^2$
    $y_1y_2y_3^2$        $y_1y_2y_3$, $y_3$          $y_1y_3$, $y_2y_3$
    $y_1y_2^2y_3$        $y_1y_3^2$, $y_2y_3$        $y_1y_2^2y_3^2$, $y_3$
    $y_1y_2^2y_3^2$      $y_1y_3$, $y_2y_3^2$        $y_1y_2^2y_3$, $y_3$

The interactions between two-factor and three-factor interactions are therefore two-factor
interactions in some cases and main effects and three-factor interactions in the other cases.
(f) The interactions between three-factor interactions are set out in the following diagram:

    Between              $y_1y_2y_3$   $y_1y_2y_3^2$      $y_1y_2^2y_3$        $y_1y_2^2y_3^2$
    $y_1y_2y_3$          -             $y_1y_2$, $y_3$    $y_1y_3$, $y_2$      $y_1$, $y_2y_3$
    $y_1y_2y_3^2$        -             -                  $y_1$, $y_2y_3^2$    $y_1y_3^2$, $y_2$
    $y_1y_2^2y_3$        -             -                  -                    $y_1y_2^2$, $y_3$
    $y_1y_2^2y_3^2$      -             -                  -                    -
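
The entries of these tables follow mechanically from adding exponents modulo $p$ and rescaling so that the first non-zero exponent is 1. A minimal sketch, with hypothetical function names, representing each symbol by its vector of exponents (the modular inverse via `pow(c, -1, p)` needs Python 3.8 or later):

```python
def normalize(a, p):
    """Rescale an exponent vector so that its first non-zero entry is 1 (p prime)."""
    for c in a:
        if c % p:
            inv = pow(c, -1, p)  # modular inverse of the leading exponent
            return tuple((inv * x) % p for x in a)
    return tuple(0 for _ in a)

def generalized_interaction(a, b, p):
    """Return the normalized products A*B, A*B^2, ..., A*B^(p-1)."""
    return [normalize(tuple((x + k * y) % p for x, y in zip(a, b)), p)
            for k in range(1, p)]

# y1 with y1*y2*y3 in the 3^3 system.
print(generalized_interaction((1, 0, 0), (1, 1, 1), 3))
# y1*y2*y3 with y1*y2^2*y3.
print(generalized_interaction((1, 1, 1), (1, 2, 1), 3))
# ABCD with CDE in a 2^5 system.
print(generalized_interaction((1, 1, 1, 1, 0), (0, 0, 1, 1, 1), 2))
```

A vector such as `(1, 2, 2)` stands for $y_1y_2^2y_3^2$ and `(0, 1, 1)` for $y_2y_3$.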



CONFOUNDING

Confounding or the allocation of treatment combinations to blocks implies the allocation of
all the points of the lattice into sets of points, such that the comparisons between
these sets involve particular sets of $p - 1$ degrees of freedom. The aim of confounding is to
reduce the effect of soil heterogeneity by reducing block size, but ensuring that the block
comparisons have little possible practical importance.

If comparisons $A = y_1^{\alpha_1} y_2^{\alpha_2} \ldots y_n^{\alpha_n}$ and $B = y_1^{\beta_1} y_2^{\beta_2} \ldots y_n^{\beta_n}$ are confounded, then so is their
generalized interaction, i.e. all the products of these two, i.e. $AB, AB^2, \ldots, AB^{p-1}$. For, if
the treatment combinations for which $\alpha_1 y_1 + \alpha_2 y_2 + \ldots + \alpha_n y_n$ is equal to $0, 1, 2, \ldots, p - 1$
are put into separate blocks and also those treatments for which $\beta_1 y_1 + \beta_2 y_2 + \ldots + \beta_n y_n$ is
equal to $0, 1, 2, \ldots, p - 1$, then $(\alpha_1 + \lambda\beta_1) y_1 + (\alpha_2 + \lambda\beta_2) y_2 + \ldots + (\alpha_n + \lambda\beta_n) y_n$ is equal (mod $p$)
to $0, 1, 2, \ldots, p - 1$ for all $\lambda$ from 0 to $p - 1$.

The present approach to confounding of the $2^n$ system is identical with that given by
Yates and we proceed to consider the rather more complex case of the $3^n$ system.

(a) $3^3$ system

(1) In blocks of $3^2$. Any three-factor interaction may be confounded.

(2) In blocks of 3. We cannot confine the confounded degrees of freedom to three-factor
interactions because the generalized interaction of any two reduces to a two-factor
interaction and a main effect. If two three-factor interactions, $y_1y_2y_3$ and $y_1y_2^2y_3$, are confounded,
the 8 degrees of freedom for blocks may be described as follows:

                        D.F.
    $y_2$                 2
    $y_1y_3$              2
    $y_1y_2y_3$           2
    $y_1y_2^2y_3$         2
                          8

We can, however, choose three two-factor interactions and one three-factor interaction pair
for our block comparisons.
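
The case of two confounded three-factor interactions can be made concrete by forming the blocks. In this sketch the helper `blocks` is hypothetical; it groups the 27 treatment combinations by the values of the two linear forms and then confirms that $y_2$ and $y_1 + y_3$ are constant within every block, which is why the main effect and the two-factor interaction are confounded as well.

```python
from itertools import product

def blocks(words, p, n):
    """Group lattice points by the values (mod p) of the given linear forms."""
    out = {}
    for y in product(range(p), repeat=n):
        key = tuple(sum(c * v for c, v in zip(w, y)) % p for w in words)
        out.setdefault(key, []).append(y)
    return out

# Confound y1*y2*y3 and y1*y2^2*y3 in the 3^3 system.
b = blocks([(1, 1, 1), (1, 2, 1)], 3, 3)

y2_constant = all(len({y[1] for y in blk}) == 1 for blk in b.values())
y1y3_constant = all(len({(y[0] + y[2]) % 3 for y in blk}) == 1 for blk in b.values())
print(len(b), y2_constant, y1y3_constant)
```

Nine blocks of three plots result, and the eight degrees of freedom between blocks are exactly the four pairs listed above.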



(b) $3^4$ system in blocks of $3^2$

It is immediately obvious that we can confound two two-factor interactions and two
higher-order interaction pairs to give blocks of nine. The important point, however, is to find
a design confounding only three-factor interaction pairs.

We therefore evaluate the interactions of all pairs of three-factor interactions which have
two letters in common. These may be derived from the interaction of $y_1y_2y_3$ with the four
three-factor interactions of $y_1$, $y_2$ and $y_4$, which are as follows:

    Interaction of $y_1y_2y_3$ and $y_1y_2y_4$:          $y_1y_2y_3^2y_4^2$ and $y_3y_4^2$.
    Interaction of $y_1y_2y_3$ and $y_1y_2y_4^2$:        $y_1y_2y_3^2y_4$ and $y_3y_4$.
    Interaction of $y_1y_2y_3$ and $y_1y_2^2y_4$:        $y_1y_3^2y_4^2$ and $y_2y_3^2y_4$.
    Interaction of $y_1y_2y_3$ and $y_1y_2^2y_4^2$:      $y_1y_3^2y_4$ and $y_2y_3^2y_4^2$.

Obviously there are many designs for the $3^4$ design in nine blocks of nine plots confounding
three-factor interactions. Those which confound four-factor interactions must also confound
two-factor interactions. The names of the confounded interactions and their squares (each
of which corresponds to the same grouping as the element itself) form a group with the
identity and the equation $y_i^3 = I$, for all $i$, and further work is presumably most promising
on these lines.


(c) $3^5$ in blocks of 9

There is no design confounding only three-factor or higher-order interactions. If one
two-factor interaction can be sacrificed, a possible scheme of confounding is given by the
following table of generalized interactions:


Between 

2 / 12 / 22/3 

Viylyl 

Vivlyi 

2 / 32 / 12/1 






yiViVi 

2/1 2/2 2/1 2/4 2/6 

2/1 2/2 2/1 2/1 

^12/22/32/5 

2/12/22/32/42/5 

2/12/42/6 

2/1 2/3 2/6 

2 / 22/6 

2 / 32 / 32 / 42/1 


This two-factor interaction is estimated by the comparison of three sets of nine blocks, 
and the accuracy of the estimate will be low. 

(d) $3^6$ in blocks of 27

We may, for example, confound the following:

ViViVs VivlyzyiVi viysvlyl 
ViViVs yiyzViyzyt yiyly^yl 
Viylytylyl Viylvlyz yxyzylyivlyl 
y^yivlvl VtiHyzyi VzViyhl 

Three three-factor interactions, six four-factor interactions, three five-factor interactions 
and one six -factor interaction are confounded. If yj is omitted from all the above expressions 



we obtain a $3^5$ experiment in blocks of nine confounding one two-factor interaction, seven
three-factor interactions, three four-factor and two five-factor interactions; that is, the
design given above for the $3^5$ system.

Extension to more complicated cases

Extensions of the above to more complicated cases should most easily be achieved by the use
of group theory. The confounding of a design in blocks corresponds to a group of
elements such that all except the unit element involve at least a certain number of
letters. For most agricultural experiments each element should contain at least three letters,
so that no main effects or two-factor interactions are confounded. The group is an Abelian
group and if $A$ and $B$ are elements of the group so are $AB, AB^2, \ldots, AB^{p-1}$. The order of each
element is $p$, and if $A$ is an element so are the first $(p - 1)$ powers of $A$. This aspect is being
followed, and it is hoped will yield results.

Fractional replication in the $2^n$ system

Some principles of fractional replication have been worked out over the past few years at
Rothamsted (Finney, 1945). In the case of a $2^n$ system, with factors $A_1, A_2, \ldots, A_n$ say, a
half-replicate might consist of those treatment combinations which form the positive part of the
interaction $A_1A_2 \ldots A_n$. Each function of the plot yields consisting of the sum of one-half
of them minus the sum of the other half then corresponds to two degrees of freedom.
Alternatively, each degree of freedom has one alias, and the aim in fractional replication is
to design the experiment so that the aliases of effects which the experimenter wishes to
measure are high-order interactions which could not possibly have practical significance.

For convenience of presentation, we develop first the theory for the case of the $2^n$ system.
Suppose that of all the points on the lattice for the $2^n$ system, only those points for which

$$y_1 + y_2 + y_3 + \ldots + y_n = 0 \pmod 2$$

are included in the experiment. Then the points on the hyper-plane $y_1 = 0$ also lie on the
plane $y_2 + y_3 + \ldots + y_n = 0$, and likewise those for which $y_1 = 1$ lie on the plane

$$y_2 + y_3 + \ldots + y_n = 1.$$

The contrast which we have denoted by $y_1$ is therefore identical with that denoted by
$y_2y_3y_4 \ldots y_n$. Again, if we suppose that only those treatment combinations are tested which
lie on the hyper-planes

$$\alpha_1 y_1 + \alpha_2 y_2 + \ldots + \alpha_n y_n = 0, \qquad \beta_1 y_1 + \beta_2 y_2 + \ldots + \beta_n y_n = 0,$$

then the points will also lie on the intersection of these planes which is given by the equation

$$(\alpha_1 + \beta_1) y_1 + (\alpha_2 + \beta_2) y_2 + \ldots + (\alpha_n + \beta_n) y_n = 0 \pmod 2.$$

The points which lie on the hyper-planes

$$\gamma_1 y_1 + \gamma_2 y_2 + \ldots + \gamma_n y_n = 0, 1 \pmod 2$$

will also lie on the planes

$$(\alpha_1 + \gamma_1) y_1 + (\alpha_2 + \gamma_2) y_2 + \ldots + (\alpha_n + \gamma_n) y_n = 0, 1 \pmod 2,$$

$$(\beta_1 + \gamma_1) y_1 + (\beta_2 + \gamma_2) y_2 + \ldots + (\beta_n + \gamma_n) y_n = 0, 1 \pmod 2,$$

$$(\alpha_1 + \beta_1 + \gamma_1) y_1 + (\alpha_2 + \beta_2 + \gamma_2) y_2 + \ldots + (\alpha_n + \beta_n + \gamma_n) y_n = 0, 1 \pmod 2.$$

Changing to the simpler notation, these results may be obtained by equating to unity the
symbols corresponding to the effects which the experiment cannot measure (as only treatment
combinations of the same sign in the function giving the effect are included) and
multiplying the symbol corresponding to a particular effect by these symbols. Thus we put

$$I = y_1^{\alpha_1} y_2^{\alpha_2} \ldots y_n^{\alpha_n} = y_1^{\beta_1} y_2^{\beta_2} \ldots y_n^{\beta_n} = y_1^{\alpha_1+\beta_1} y_2^{\alpha_2+\beta_2} \ldots y_n^{\alpha_n+\beta_n},$$

then the contrast

$$y_1^{\gamma_1} y_2^{\gamma_2} \ldots y_n^{\gamma_n}$$

is the same as those given by

$$y_1^{\alpha_1+\gamma_1} y_2^{\alpha_2+\gamma_2} \ldots y_n^{\alpha_n+\gamma_n}, \quad y_1^{\beta_1+\gamma_1} y_2^{\beta_2+\gamma_2} \ldots y_n^{\beta_n+\gamma_n} \quad\text{and}\quad y_1^{\alpha_1+\beta_1+\gamma_1} y_2^{\alpha_2+\beta_2+\gamma_2} \ldots y_n^{\alpha_n+\beta_n+\gamma_n},$$

where each power is reduced modulo 2.
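
Since powers are reduced modulo 2, multiplying two symbols amounts to taking the symmetric difference of their sets of factor indices. The following sketch uses that representation; the function names are hypothetical, and the six-factor generators shown are only an illustrative choice of identity relationship.

```python
def defining_group(generators):
    """All products of the generators, each word held as a frozenset of factor indices."""
    group = {frozenset()}
    for g in generators:
        group |= {w ^ frozenset(g) for w in group}
    return group

def aliases(effect, generators):
    """Aliases of an effect: its product with every non-identity word of the group."""
    e = frozenset(effect)
    return sorted(sorted(e ^ w) for w in defining_group(generators) if w)

# Illustrative quarter-replicate of a 2^6 system: I = y1y2y3y4 = y3y4y5y6 (= y1y2y5y6).
gens = [{1, 2, 3, 4}, {3, 4, 5, 6}]
print(aliases({1}, gens))
print(aliases({1, 2}, gens))
```

Each effect acquires one alias per non-identity element of the group, so a quarter-replicate gives three aliases and an eighth-replicate seven.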

$2^n$ SYSTEM WITHOUT SUBDIVISION INTO BLOCKS

We now consider some of the possibilities of partial replication for the $2^n$ system. The basis
of designs with fractional replication is the choice of an identity relationship; most of the
possible relationships are of no value, and we consider only those which yield the least
possible confusion between main effects and first-order interactions.

Half-replication

$n = 3$. If we take $I = y_1y_2y_3$, then $y_1 = y_1(y_1y_2y_3) = y_1^2y_2y_3 = y_2y_3$. Such a design, which
confuses main effects and two-factor interactions, would not be of any practical use.

$n = 4$. If we take $I = y_1y_2y_3y_4$, then the aliases are exemplified by

$$y_1 = y_2y_3y_4 \quad\text{and}\quad y_1y_2 = y_3y_4.$$

Such a design would not be used unless the experimenter were confident that two-factor
interactions were negligible.

$n = 5$. If we take $I = y_1y_2y_3y_4y_5$, aliases are exemplified by

$$y_1 = y_2y_3y_4y_5 \quad\text{and}\quad y_1y_2 = y_3y_4y_5.$$

A half-replicate with five or more factors is feasible when there is no necessity to remove
heterogeneity by the use of blocks, since main effects will have aliases which are interactions
of four factors at least, and two-factor interactions will have aliases which are interactions of
at least three factors.

Quarter-replication

Each degree of freedom will now have three aliases. For each value of $n$ we give the identity
relationship and typical alias relationships.

$n = 4$. $I = y_1y_2 = y_3y_4 = y_1y_2y_3y_4$;
then $y_1 = y_2 = y_1y_3y_4 = y_2y_3y_4$ and $y_1y_3 = y_2y_3 = y_1y_4 = y_2y_4$.

$n = 5$. $I = y_1y_2 = y_3y_4y_5 = y_1y_2y_3y_4y_5$ (a),
or $I = y_1y_2y_3 = y_3y_4y_5 = y_1y_2y_4y_5$ (b).

(a) Gives $y_1 = y_2 = y_1y_3y_4y_5 = y_2y_3y_4y_5$ and $y_1y_3 = y_2y_3 = y_1y_4y_5 = y_2y_4y_5$.

(b) Gives $y_1 = y_2y_3 = y_1y_3y_4y_5 = y_2y_4y_5$.

$n = 6$. $I = y_1y_2y_3y_4 = y_3y_4y_5y_6 = y_1y_2y_5y_6$;
then $y_1 = y_2y_3y_4 = y_1y_3y_4y_5y_6 = y_2y_5y_6$
and $y_1y_2 = y_3y_4 = y_1y_2y_3y_4y_5y_6 = y_5y_6$.

$n = 7$. $I = y_1y_2y_3y_4 = y_4y_5y_6y_7 = y_1y_2y_3y_5y_6y_7$;
then $y_1 = y_2y_3y_4$ and $y_1y_2 = y_3y_4$.

$n = 8$. $I = y_1y_2y_3y_4y_5 = y_4y_5y_6y_7y_8 = y_1y_2y_3y_6y_7y_8$;
then $y_1 = y_2y_3y_4y_5$ and $y_1y_2 = y_3y_4y_5$.

Designs in quarter replicate are therefore possible when $n$ is greater than or equal to 8.

High-order fractional replication

In general, the existence of fractional designs of the $2^n$ system with fraction $2^{-p}$, which will
be useful where information on all main effects and two-factor interactions is required,
depends on the existence of a group of $2^p$ elements, one element being unity and the other
elements all containing at least five letters. No simple method has been found of enumerating
such groups, but it is perhaps worth recording the following designs which appear to represent
the greatest degree of fractional replication possible.

(a) Eighth replication 

If we are testing ten or more factors at each of two levels, one-eighth of a replication will 
enable main effects and two-factor interactions to be estimated. An appropriate identity 
relationship is the following; 

$I = y_1y_2y_3y_4y_5 = y_1y_2y_6y_7y_8 = y_3y_4y_5y_6y_7y_8$

$= y_1y_3y_7y_9y_{10} = y_2y_4y_5y_7y_9y_{10} = y_2y_3y_6y_8y_9y_{10} = y_1y_4y_5y_6y_8y_9y_{10}$.

Thus ten main effects and forty-five two-factor interactions may be estimated from a trial 
testing 128 of the 1024 possible treatment combinations. 
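A brute-force check of this claim is possible. The sketch below assumes that y1y2y3y4y5, y1y2y6y7y8 and y1y3y7y9y10 generate the identity relationship, and it keeps the fraction containing the combination with all factors at the lower level; which of the eight fractions is used in practice is not stated here, so that choice is an assumption. It confirms that 128 combinations remain and that the 55 contrasts for main effects and two-factor interactions are mutually orthogonal over them.

```python
from itertools import combinations, product

# Assumed generators of the eighth-replicate identity relationship.
GENERATORS = [{1, 2, 3, 4, 5}, {1, 2, 6, 7, 8}, {1, 3, 7, 9, 10}]

def in_fraction(run):
    """Keep a run (tuple of 0/1 levels for y1..y10) if every generator has even parity on it."""
    return all(sum(run[i - 1] for i in g) % 2 == 0 for g in GENERATORS)

runs = [r for r in product((0, 1), repeat=10) if in_fraction(r)]

def contrast(word):
    """+1/-1 column of the effect `word` over the retained runs."""
    return [(-1) ** sum(r[i - 1] for i in word) for r in runs]

effects = [(i,) for i in range(1, 11)] + list(combinations(range(1, 11), 2))
columns = [contrast(w) for w in effects]
orthogonal = all(
    sum(a * b for a, b in zip(c1, c2)) == 0
    for c1, c2 in combinations(columns, 2)
)
print(len(runs), len(effects), orthogonal)
```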

(b) Sixteenth replication

If we are testing twelve or more factors a possible identity relationship is the following:

$I = y_1y_2y_3y_4y_5 = y_1y_2y_6y_7y_8 = y_3y_4y_5y_6y_7y_8 = y_1y_2y_9y_{10}y_{11}$
$= y_3y_4y_5y_9y_{10}y_{11} = y_6y_7y_8y_9y_{10}y_{11} = y_1y_2y_3y_4y_5y_6y_7y_8y_9y_{10}y_{11}$

$= y_1y_3y_6y_9y_{12} = y_2y_4y_5y_6y_9y_{12} = y_2y_3y_7y_8y_9y_{12} = y_1y_4y_5y_7y_8y_9y_{12}$
$= y_2y_3y_6y_{10}y_{11}y_{12} = y_1y_4y_5y_6y_{10}y_{11}y_{12} = y_1y_3y_7y_8y_{10}y_{11}y_{12} = y_2y_4y_5y_7y_8y_{10}y_{11}y_{12}$.

In this case twelve main effects and sixty-six two-factor interactions may be estimated from 
a trial testing 256 of the possible 4096 treatment combinations. 
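The fifteen words of this identity relationship can be regenerated from four of them, and their lengths tabulated, to confirm that none has fewer than five letters. The choice of generators below is an assumption made for illustration; any four independent words of the relationship would serve.

```python
from collections import Counter

# Four assumed generators of the sixteenth-replicate identity relationship.
GENERATORS_16TH = [
    {1, 2, 3, 4, 5}, {1, 2, 6, 7, 8}, {1, 2, 9, 10, 11}, {1, 3, 6, 9, 12},
]

def defining_words(generators):
    """Close the generators under multiplication and drop the identity."""
    words = {frozenset()}
    for g in generators:
        words |= {w ^ frozenset(g) for w in words}
    words.discard(frozenset())
    return words

words = defining_words(GENERATORS_16TH)
length_pattern = Counter(len(w) for w in words)
print(sorted(length_pattern.items()))
# Runs in the fraction, and number of main effects plus two-factor interactions.
print(2 ** 12 // 2 ** 4, 12 + 12 * 11 // 2)
```

The second printed line gives the 256 treatment combinations and the 78 effects (twelve main effects and sixty-six two-factor interactions) mentioned in the text.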

The extent to which these designs will be of practical value depends very much on the
existence of a sufficient mass of reasonably homogeneous material to test the large number
of treatment combinations without the necessity of dividing the material into smaller
batches and using the device of confounding. An experiment involving say 256 different
treatment combinations is not large by modern standards. At Rothamsted, for example, an
experiment involving 200 distinct treatments on 300 plots has been carried out for some
years: this experiment was, however, made possible by utilizing the elimination of the
effects of soil heterogeneity by highly complex confounding; the design, in fact, consisted of
three 5 x 5 lattice squares necessitating seventy-five plots, and each of these plots was split
into four subplots. The advantages of testing twelve factors, say, at the same time under
virtually the same experimental conditions cannot, however, be ignored. Such an experiment
should have more value, other things being equal, than two distinct experiments each
testing some of the factors. An examination has not been made of the possibilities of reducing
block size by confounding for the above two designs, but it is probably necessary to sacrifice
a few two-factor interactions.

The relationship between fractional replication and confounding

It is clear that fractional replication and confounding are different aspects of the same
process. A $2^n$ design of $2^p$ blocks may be described as a 1 in $2^p$ replicate of a $2^{n+p}$ design
with no subdivision into blocks, by regarding the blocks as a $2^p$ system in $p$ factors. As an



O. Kempthorne




example, consider the $2^5$ design in $y_1$, $y_2$, $y_3$, $y_4$ and $y_5$ laid out in four blocks of eight and
confounding $y_1y_2y_3$, $y_3y_4y_5$ and $y_1y_2y_4y_5$; superimposing two pseudo-factors $b_1$ and $b_2$, the
experiment is a quarter-replicate of a $2^7$ design in $y_1$, $y_2$, $y_3$, $y_4$, $y_5$, $b_1$, $b_2$. The identity on which
the quarter replicate is based is given by the equations

$b_1 = y_1y_2y_3$, $b_2 = y_3y_4y_5$, $b_1b_2 = y_1y_2y_4y_5$

or the equation $I = y_1y_2y_3b_1 = y_3y_4y_5b_2 = y_1y_2y_4y_5b_1b_2$.

If we examine this equation in the same way as in the previous sections, we find that the
design depends on the fact that the aliases of the following type may be ignored:

$y_1 = y_2y_3b_1 = y_1y_3y_4y_5b_2 = y_2y_4y_5b_1b_2$,

$y_1y_2 = y_3b_1 = y_1y_2y_3y_4y_5b_2 = y_4y_5b_1b_2$.
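The role of the pseudo-factors can be made concrete by computing the block of every treatment combination from the two parities b1 (on y1y2y3) and b2 (on y3y4y5). The sketch below writes the factors as a, b, c, d, e and labels blocks by the pair (b1, b2); these labels and the function name are illustrative choices.

```python
from itertools import product

FACTORS = "abcde"

def block_of(run):
    """Block label (b1, b2) of a treatment combination given as a string such as 'ace'."""
    b1 = sum(ch in run for ch in "abc") % 2   # parity on y1 y2 y3
    b2 = sum(ch in run for ch in "cde") % 2   # parity on y3 y4 y5
    return b1, b2

blocks = {}
for levels in product((0, 1), repeat=5):
    run = "".join(f for f, on in zip(FACTORS, levels) if on) or "(1)"
    blocks.setdefault(block_of(run), []).append(run)

for key in sorted(blocks):
    print(key, blocks[key])
```

Each of the four blocks receives eight of the thirty-two combinations, which is the layout in four blocks of eight described above.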

This example is worth pursuing. The design is frequently used with one replication only,
the error being estimated from three-factor and higher-order interactions. We set out below
the identity and 31 degrees of freedom together with all their aliases and their usual place
in the analysis of variance: blocks (B), treatment (T), or error (E). For convenience of
printing we denote the factors tested in the experiment by a, b, c, d, e instead of $y_1$, $y_2$, $y_3$, $y_4$, $y_5$
and the block factors by x and y. Capitals are used for treatment effects, thus conforming
to present usage.


I      = ABCX    = CDEY    = ABDEXY
A      = BCX     = ACDEY   = BDEXY       T
B      = ACX     = BCDEY   = ADEXY       T
AB     = CX      = ABCDEY  = DEXY        T
C      = ABX     = DEY     = ABCDEXY     T
AC     = BX      = ADEY    = BCDEXY      T
BC     = AX      = BDEY    = ACDEXY      T
ABC    = X       = ABDEY   = CDEXY       B
D      = ABCDX   = CEY     = ABEXY       T
AD     = BCDX    = ACEY    = BEXY        T
BD     = ACDX    = BCEY    = AEXY        T
ABD    = CDX     = ABCEY   = EXY         E
CD     = ABDX    = EY      = ABCEXY      T
ACD    = BDX     = AEY     = BCEXY       E
BCD    = ADX     = BEY     = ACEXY       E
ABCD   = DX      = ABEY    = CEXY        E
E      = ABCEX   = CDY     = ABDXY       T
AE     = BCEX    = ACDY    = BDXY        T
BE     = ACEX    = BCDY    = ADXY        T
ABE    = CEX     = ABCDY   = DXY         E
CE     = ABEX    = DY      = ABCDXY      T
ACE    = BEX     = ADY     = BCDXY       E
BCE    = AEX     = BDY     = ACDXY       E
ABCE   = EX      = ABDY    = CDXY        E
DE     = ABCDEX  = CY      = ABXY        T
ADE    = BCDEX   = ACY     = BXY         E
BDE    = ACDEX   = BCY     = AXY         E
ABDE   = CDEX    = ABCY    = XY          B
CDE    = ABDEX   = Y       = ABCXY       B
ACDE   = BDEX    = AY      = BCXY        E
BCDE   = ADEX    = BY      = ACXY        E
ABCDE  = DEX     = ABY     = CXY         E



If we take for each linear function of the yields the alias involving the smallest possible
number of letters, but remembering that x, y are pseudo-factors, so that X, Y and XY are of
equal importance and therefore XY should be regarded as a main effect and not an
interaction, we have the following allocation of contrasts to the three components of the analysis
of variance:

Blocks: X, Y, XY.

Treatments: A, B, C, D, E,
AB = CX, AC = BX, BC = AX,
CD = EY, DE = CY, CE = DY,
AD, BD, AE, BE.

Error: AY, BY, DX, EX, AXY, BXY, CXY, DXY, EXY, ACD, BCD, ACE, BCE.

The four three-factor interactions could equally well be regarded as interactions between
two-factor interactions and blocks. It would be anticipated that these would be smaller
than the interactions of main effects and blocks. The purpose of the present exposition is
to give a clear statement of the possible interpretations of the results of an individual
experiment. Further remarks on the problem of interpretation are postponed to a later
section in the paper.
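The allocation of the 31 contrasts can be reproduced mechanically. The sketch below assumes the rule stated in the text: take the alias with the fewest factors, counting XY as a single factor. A contrast aliased with X, Y or XY goes to blocks; otherwise it is a treatment contrast if the original effect is a main effect or two-factor interaction, and an error contrast if not. Function names are illustrative.

```python
from itertools import combinations

IDENTITY = ["ABCX", "CDEY", "ABDEXY"]

def times(u, v):
    """Product of two effect symbols; squared letters cancel."""
    return "".join(sorted(set(u) ^ set(v)))

def weight(word):
    """Number of factors in a word, counting the block contrast XY as one factor."""
    return len(word) - (1 if "X" in word and "Y" in word else 0)

def smallest_alias(effect):
    """Alias with the fewest factors (the effect itself wins ties)."""
    return min([effect] + [times(effect, w) for w in IDENTITY], key=weight)

def place(effect):
    """'B', 'T' or 'E': the usual place of the contrast in the analysis of variance."""
    if set(smallest_alias(effect)) <= {"X", "Y"}:
        return "B"
    return "T" if len(effect) <= 2 else "E"

effects = ["".join(c) for k in range(1, 6) for c in combinations("ABCDE", k)]
counts = {p: sum(place(e) == p for e in effects) for p in "BTE"}
print(counts)
print([smallest_alias(e) for e in effects if place(e) == "E"])
```

The counts come to 3 for blocks, 15 for treatments and 13 for error, and the second line lists the thirteen error contrasts in their shortest form.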

An example of fractional replication with confounding 

A design which has proved of practical utility is the half-replicate of a $2^6$ experiment arranged
in four blocks of eight plots.

Call the factors $y_1$, $y_2$, $y_3$, $y_4$, $y_5$, $y_6$. Then the best confounding is that in which, using full
replication, the block differences are all third-order interactions, say

$y_1y_2y_3y_4$, $y_3y_4y_5y_6$ and $y_1y_2y_5y_6$.

But it is impossible to keep main effects and interactions clear with this confounding,
whatever interaction is equated to the identity.

If we take the confounded interactions to be of the type

$y_1y_2y_3$, $y_3y_4y_5$, $y_1y_2y_4y_5$,

and the interaction $y_1y_2y_3y_4y_5y_6$ to be unity, then the following interactions are also
confounded:

$y_4y_5y_6$, $y_1y_2y_6$ and $y_3y_6$.

It will be found by enumeration of the possibilities that one first-order interaction must 
be sacrificed. All main effects and the other first-order interactions will have high-order 
aliases. 

It is interesting to examine this design in the same way as the $2^5$ above for the relations
between block-treatment interactions and treatment interactions.

There are, in fact, only thirty-two independent contrasts, and it is simplest to enumerate
these by operating on the identity relationship with the thirty-two possibilities for the $2^5$
system omitting $y_6$. As before, we insert block pseudo-factors. For simplicity of printing we
use A, B, C, D, E, F for the factors and X, Y for the block factors. Then

I = ABCDEF, X = ABC, Y = CDE, XY = ABDE,

and combining these into one relationship, we have

I = ABCDEF = ABCX = CDEY = ABDEXY = DEFX = ABFY = CFXY.
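The combined relationship is the set of all products of the three words ABCDEF, ABCX and CDEY. A short sketch that generates it (the function names are illustrative):

```python
def times(u, v):
    """Product of two effect symbols; squared letters cancel."""
    return "".join(sorted(set(u) ^ set(v)))

def span(generators):
    """Close a list of words under multiplication (identity omitted from the result)."""
    words = {""}
    for g in generators:
        words |= {times(w, g) for w in words}
    return sorted(words - {""}, key=lambda w: (len(w), w))

identity = span(["ABCDEF", "ABCX", "CDEY"])
print(" = ".join(["I"] + identity))
```

The seven words produced are those of the combined relationship, printed in order of length rather than in the order used in the text.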





A complete table of the aliases for this design follows:

I      = ABCDEF  = ABCX     = DEFX     = CDEY    = ABFY     = ABDEXY   = CFXY
A      = BCDEF   = BCX      = ADEFX    = ACDEY   = BFY      = BDEXY    = ACFXY      T
B      = ACDEF   = ACX      = BDEFX    = BCDEY   = AFY      = ADEXY    = BCFXY      T
AB     = CDEF    = CX       = ABDEFX   = ABCDEY  = FY       = DEXY     = ABCFXY     T
C      = ABDEF   = ABX      = CDEFX    = DEY     = ABCFY    = ABCDEXY  = FXY        T
AC     = BDEF    = BX       = ACDEFX   = ADEY    = BCFY     = BCDEXY   = AFXY       T
BC     = ADEF    = AX       = BCDEFX   = BDEY    = ACFY     = ACDEXY   = BFXY       T
ABC    = DEF     = X        = ABCDEFX  = ABDEY   = CFY      = CDEXY    = ABFXY      B
D      = ABCEF   = ABCDX    = EFX      = CEY     = ABDFY    = ABEXY    = CDFXY      T
AD     = BCEF    = BCDX     = AEFX     = ACEY    = BDFY     = BEXY     = ACDFXY     T
BD     = ACEF    = ACDX     = BEFX     = BCEY    = ADFY     = AEXY     = BCDFXY     T
ABD    = CEF     = CDX      = ABEFX    = ABCEY   = DFY      = EXY      = ABCDFXY    E
CD     = ABEF    = ABDX     = CEFX     = EY      = ABCDFY   = ABCEXY   = DFXY       T
ACD    = BEF     = BDX      = ACEFX    = AEY     = BCDFY    = BCEXY    = ADFXY      E
BCD    = AEF     = ADX      = BCEFX    = BEY     = ACDFY    = ACEXY    = BDFXY      E
ABCD   = EF      = DX       = ABCEFX   = ABEY    = CDFY     = CEXY     = ABDFXY     T
E      = ABCDF   = ABCEX    = DFX      = CDY     = ABEFY    = ABDXY    = CEFXY      T
AE     = BCDF    = BCEX     = ADFX     = ACDY    = BEFY     = BDXY     = ACEFXY     T
BE     = ACDF    = ACEX     = BDFX     = BCDY    = AEFY     = ADXY     = BCEFXY     T
ABE    = CDF     = CEX      = ABDFX    = ABCDY   = EFY      = DXY      = ABCEFXY    E
CE     = ABDF    = ABEX     = CDFX     = DY      = ABCEFY   = ABCDXY   = EFXY       T
ACE    = BDF     = BEX      = ACDFX    = ADY     = BCEFY    = BCDXY    = AEFXY      E
BCE    = ADF     = AEX      = BCDFX    = BDY     = ACEFY    = ACDXY    = BEFXY      E
ABCE   = DF      = EX       = ABCDFX   = ABDY    = CEFY     = CDXY     = ABEFXY     T
DE     = ABCF    = ABCDEX   = FX       = CY      = ABDEFY   = ABXY     = CDEFXY     T
ADE    = BCF     = BCDEX    = AFX      = ACY     = BDEFY    = BXY      = ACDEFXY    E
BDE    = ACF     = ACDEX    = BFX      = BCY     = ADEFY    = AXY      = BCDEFXY    E
ABDE   = CF      = CDEX     = ABFX     = ABCY    = DEFY     = XY       = ABCDEFXY   B
CDE    = ABF     = ABDEX    = CFX      = Y       = ABCDEFY  = ABCXY    = DEFXY      B
ACDE   = BF      = BDEX     = ACFX     = AY      = BCDEFY   = BCXY     = ADEFXY     T
BCDE   = AF      = ADEX     = BCFX     = BY      = ACDEFY   = ACXY     = BDEFXY     T
ABCDE  = F       = DEX      = ABCFX    = ABY     = CDEFY    = CXY      = ABDEFXY    T


The partition of the degrees of freedom in the analysis of variance which would generally
be made is the following:

                                  D.F.
Blocks                              3
Treatments: Main effects            6
            Interactions           14
Error                               8
Total                              31

The table of aliases is condensed below by the omission of all aliases involving more than
two factors, counting, as before, XY as a single factor as well as X and Y.

Effects A, B, D, E have aliases of at least three letters, but C = FXY and F = CXY.

Effects AD, BD, AE, BE have aliases of at least three letters, but

AB = CX = FY, AC = BX, BC = AX, CD = EY, EF = DX,

CE = DY, DE = FX = CY, DF = EX, BF = AY, AF = BY.

In an experiment in which block-treatment interactions cannot be assumed to be negligible
in relation to the effects it is desired to estimate, the interpretation of most two-factor
interactions is difficult if not impossible. The following identities of practical interest exist
for the terms which would be used to estimate the error: ACD, BCD, ACE, BCE have
aliases of three letters and are either three-factor interactions or interactions between blocks
and two-factor interactions, but ABD = EXY, ABE = DXY, ADE = BXY, and
BDE = AXY.

This design is very similar in result to the fully replicated but confounded $2^5$ design
described above.
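The partition 3 + 6 + 14 + 8 = 31 can be recovered by grouping the effects of the six factors into alias sets and classifying each set. The sketch below assumes the classification used in the text: a set containing X, Y or XY belongs to blocks, a set containing a main effect or a two-factor interaction of A to F belongs to treatments, and the rest to error. All names are illustrative.

```python
from itertools import combinations

IDENTITY = ["ABCDEF", "ABCX", "DEFX", "CDEY", "ABFY", "ABDEXY", "CFXY"]

def times(u, v):
    """Product of two effect symbols; squared letters cancel."""
    return "".join(sorted(set(u) ^ set(v)))

classes = {}   # one entry per distinct alias set
for k in range(1, 7):
    for combo in combinations("ABCDEF", k):
        effect = "".join(combo)
        aliases = frozenset({effect} | {times(effect, w) for w in IDENTITY})
        if "" in aliases:          # ABCDEF itself is part of the identity
            continue
        treatment_words = [w for w in aliases if not set(w) & {"X", "Y"}]
        if any(set(w) <= {"X", "Y"} for w in aliases):
            kind = "blocks"
        elif any(len(w) == 1 for w in treatment_words):
            kind = "main effects"
        elif any(len(w) == 2 for w in treatment_words):
            kind = "interactions"
        else:
            kind = "error"
        classes[aliases] = kind

partition = {k: list(classes.values()).count(k)
             for k in ("blocks", "main effects", "interactions", "error")}
print(partition)
```

Fourteen rather than fifteen two-factor interactions appear because CF is aliased with the block contrast XY.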


Fractional replication in the $3^n$ system

Here we have to consider treatment effects assessed from powers of one-third of a complete
replicate. Only those treatment combinations represented by points of the lattice lying on
the hyperplane

$\alpha_1y_1 + \alpha_2y_2 + \dots + \alpha_ny_n = 0$, or 1, or 2 (mod 3)

will be included in a one-third replicate.

A particular treatment effect is given by the differences between the means of those plots
represented by points on the following three planes:

$\beta_1y_1 + \beta_2y_2 + \dots + \beta_ny_n = 0$ (mod 3),

$\beta_1y_1 + \beta_2y_2 + \dots + \beta_ny_n = 1$ (mod 3),

$\beta_1y_1 + \beta_2y_2 + \dots + \beta_ny_n = 2$ (mod 3).

It is obvious that the points lying on the first plane will also lie on the planes

$(\beta_1 + \lambda\alpha_1)y_1 + (\beta_2 + \lambda\alpha_2)y_2 + \dots + (\beta_n + \lambda\alpha_n)y_n = 0$ (mod 3), for $\lambda = 1$ and 2;

the points on the other two planes will lie on these planes with 1 and 2 respectively on the
right-hand side of the equation.

The aliases of each pair of degrees of freedom are therefore obtained by multiplication of
its symbol by $y_1^{\alpha_1}y_2^{\alpha_2}\dots y_n^{\alpha_n}$ and by its square.

As an example, suppose a third replicate of a $3^3$ design is based on the inclusion only of
those treatment combinations represented by the symbol $y_1y_2y_3$ ($y_1 + y_2 + y_3 = 0$, say); then
the aliases are exemplified by the relationship $y_1 = y_1y_2^2y_3^2 = y_2y_3$.
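The same multiplication can be written with exponents taken modulo 3. The sketch below represents a symbol by its tuple of exponents and uses the convention that a symbol and its square describe the same pair of degrees of freedom, so each result is rewritten with leading exponent 1; that normalisation and the function names are assumptions made for illustration.

```python
def multiply(u, v):
    """Product of two symbols of the 3^n system, given as exponent tuples (mod 3)."""
    return tuple((a + b) % 3 for a, b in zip(u, v))

def standard(u):
    """A symbol and its square carry the same two d.f.; make the leading exponent 1."""
    lead = next((e for e in u if e), 0)
    return multiply(u, u) if lead == 2 else u

def aliases(effect, identity):
    """Multiply the effect by the identity symbol and by its square."""
    squared = multiply(identity, identity)
    return [standard(multiply(effect, identity)), standard(multiply(effect, squared))]

def label(u):
    """Render an exponent tuple as y1y2^2..., with 'I' for the identity."""
    return "".join(f"y{i}" + ("^2" if e == 2 else "") for i, e in enumerate(u, 1) if e) or "I"

print(label((1, 0, 0)), "=", " = ".join(label(a) for a in aliases((1, 0, 0), (1, 1, 1))))
```

For the third replicate based on y1y2y3 this prints the relationship given in the text for y1.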

The confounding of one replicate of a $3^3$ experiment in three blocks of nine plots

A frequently used design is the $3^3$ in three blocks of nine plots, testing all combinations of
three factors each at three levels. This design is formally a one-third replicate of a $3^4$ design.
Suppose the factors are $y_1$, $y_2$, and $y_3$ and let blocks be denoted by the pseudo-factor $b$;
a three-factor interaction of $y_1$, $y_2$, and $y_3$, say $y_1y_2y_3$, is usually confounded in order to keep
main effects and first-order interactions free of block effects.

Then $b = y_1y_2y_3$ or $I = y_1y_2y_3b^2$, since $b^3 = 1$.

As in the case of the $2^5$ design, we work out the aliases of each pair of degrees of freedom:
each pair of degrees of freedom will in this case have two aliases:

$y_1 = y_1y_2^2y_3^2b = y_2y_3b^2$
$y_2 = y_1y_2^2y_3b^2 = y_1y_3b^2$
$y_3 = y_1y_2y_3^2b^2 = y_1y_2b^2$
$y_1y_2 = y_1y_2y_3^2b = y_3b^2$
$y_1y_2^2 = y_1y_3^2b = y_2y_3^2b$
$y_1y_3 = y_1y_2^2y_3b = y_2b^2$
$y_1y_3^2 = y_1y_2^2b = y_2y_3^2b^2$
$y_2y_3 = y_1y_2^2y_3^2b^2 = y_1b^2$
$y_2y_3^2 = y_1y_2^2b^2 = y_1y_3^2b^2$
$y_1y_2y_3 = y_1y_2y_3b = b$
$y_1y_2y_3^2 = y_1y_2b = y_3b$
$y_1y_2^2y_3 = y_1y_3b = y_2b$
$y_1y_2^2y_3^2 = y_1b = y_2y_3b$
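The blocks themselves follow from the relation b = y1y2y3: a treatment combination goes to the block given by the sum of its three levels modulo 3. The sketch below codes the levels as 0, 1, 2 (an assumption about labelling) and checks that each block holds nine plots with every level of every factor appearing three times.

```python
from collections import Counter
from itertools import product

blocks = {0: [], 1: [], 2: []}
for run in product(range(3), repeat=3):
    blocks[sum(run) % 3].append(run)   # b = y1 + y2 + y3 (mod 3)

for b, runs in blocks.items():
    level_counts = [Counter(r[i] for r in runs) for i in range(3)]
    print(b, len(runs), [sorted(c.values()) for c in level_counts])
```

The balance of levels within each block is what keeps main-effect comparisons free of block differences.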






Here again the identities could result in difficulty in interpretation, as of course could
have been predicted from the examination of the possible arrangements in blocks of nine
of the $3^3$ design. The main effects may be regarded as clear, and three of the first-order
interactions. The remaining two-factor interactions could be ascribed to differential effects
of the factors on the three blocks. The three-factor interactions which are not confounded
with blocks are also ascribable to interactions of main effects and blocks and may therefore
be used to form an estimate of the error of these effects.


General remarks on confounding

The device of confounding is used almost without exception in agricultural experiments in
order to reduce the block size to twelve plots or fewer. As the above results indicate, there are
two aspects which then need careful consideration: (a) the estimation of interactions, and
(b) the estimation of the experimental error.

The main purpose of the factorial design is the estimation of main effects and interactions 
between pairs of factors and thence of the effect of any one factor in the presence and absence 
of each of the other factors. It is clear that when it is necessary to remove soil heterogeneity 
by confounding, the interpretation of a small experiment involving a few factors may be 
exceedingly difficult because of the possibility of block-treatment interactions. It is possible 
to use the rule that a large contrast should be regarded as the interaction between 
whichever pair of main effects is the larger, but this rule will break down in some cases when, 
for example, the contrast has two aliases AB and CD, and effects A and C are large and 
B and D small. In the case of a series of experiments, a device which might be helpful is the
use of permutations of the possible identity relationships, one at each centre. The modern
emphasis in agricultural experimentation is on series of experiments at various places and
in several years, rather than on individual experiments. Interactions of pairs of factors
will be estimated correctly from a large series of experiments if treatments are assigned 
at random to blocks. 

The evaluation of two-factor interactions for individual experiments depends on the 
assumption that block-treatment interactions are small compared with the experimental 
error. Yates (1935) examined several experiments for the existence of such interactions and 
found no evidence of them. Since that time a large number of experimental results which 
can be used to provide information on the question have been accumulated, and an investiga- 
tion of these has indicated that block-treatment interactions are negligible and may be 
ignored (Kempthorne, 1947). 

With regard to the estimation of error, in so far as tests of significance are of interest, it can 
be said that the analysis of variance does provide a test of significance of the hypothesis that 
the treatments have an overall effect different from zero. In agricultural experimentation, the 
term error is used to denote block-treatment interactions. Thus in the simple randomized 
block experiment, it is possible to evaluate the difference between two treatments from each 
block, and it is the variability of this difference from block to block which is regarded as the 
error. In general, as there are usually few blocks, and the error of each comparison would be
determined with poor accuracy, the errors of all the possible independent comparisons
are pooled to give a common estimate. If the treatments were duplicated at random
within each block, the analysis would be of the form (r being the number of blocks and
t of treatments):

                              D.F.
Blocks                        r - 1
Treatments                    t - 1
Treatments by blocks          (r - 1)(t - 1)
Within blocks                 rt
Total                         2rt - 1
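
The total in this table can be checked directly. With each of the t treatments duplicated in each of the r blocks there are 2rt plots, and the four components add up as follows (a short verification, written in LaTeX):

```latex
\[
(r-1) + (t-1) + (r-1)(t-1) + rt \;=\; (rt - 1) + rt \;=\; 2rt - 1 .
\]
```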


The component ‘within blocks’ could more accurately be described as experimental 
error, but would not be used to evaluate the errors of treatment effects, since the experi- 
menter is interested in the constancy of treatment effects from block to block. There is 
therefore little point in actually carrying out such an experiment. In a factorial experiment 
with replication, the components which could be evaluated consist of replicates, effects and 
low-order interactions, high-order interactions, and interactions of treatments and repli- 
cates. On the assumption that the sum of squares for interaction of treatments and 
replicates is homogeneous, the mean square for high-order interactions will include the mean 
square for treatments x replicates plus a component of variance due to high-order inter- 
actions. When only one replication is used, it is assumed that the component of variance due 
to high-order interactions is small, and that the high-order interactions mean square can be 
regarded as an estimate of error. It is important to bear in mind that an individual agri- 
cultural experiment can give information only for a particular set of experimental conditions 
and that it is known from experience that place to place and year to year variability is 
considerable. It would therefore be uneconomical to utilize available resources to determine
effects and their errors at a few particular places very accurately, but preferable to sacrifice 
replication at each place in order to have information over a large range of experimental 
conditions. 


Mixed systems 

It is not proposed to examine mixed systems of the type $p^m \times q^n$, where p and q are primes, in
the present paper. It is clear, however, that the possibilities of complete confounding and
fractional replication are very limited. A pth replicate must obviously include $p^{m-1}$
combinations of the m factors combined with all the $q^n$ combinations of the n factors. For
the examination of treatment aliases the system may be regarded as the product of the two
separate systems. Thus if p = 3, m = 2, q = 2, n = 3 and the factors are $y_1$, $y_2$, $y_3$, $y_4$, $y_5$, then
a half replicate would be obtained by putting $I = y_3y_4y_5$. The aliases which result are
exemplified by the following:

$y_1 = y_1y_3y_4y_5$, $y_1y_2y_3 = y_1y_2y_4y_5$,

$y_1y_2 = y_1y_2y_3y_4y_5$, $y_3 = y_4y_5$.

Such designs with fractional replication or complete confounding are therefore useful only 
when the corresponding designs for the two separate systems are feasible. 

Comments on ‘The design of optimum multifactorial experiments’

In a paper entitled ‘The Design of Optimum Multifactorial Experiments’, Plackett &
Burman (1946) put forward designs more specifically for physical and industrial research,
which are of interest from the point of view of fractional replication. In order to estimate the
effect of varying nine components of an assembly, each component having two possible
values, a nominal (-) and an extreme (+), they put forward the following design which
requires the testing of sixteen assemblies:






                          Components
               1   2   3   4   5   6   7   8   9
Assembly  1    +   -   -   -   +   -   -   +   +
          2    +   +   -   -   -   +   -   -   +
          3    +   +   +   -   -   -   +   -   -
          4    +   +   +   +   -   -   -   +   -
          5    -   +   +   +   +   -   -   -   +
          6    +   -   +   +   +   +   -   -   -
          7    -   +   -   +   +   +   +   -   -
          8    +   -   +   -   +   +   +   +   -
          9    +   +   -   +   -   +   +   +   +
         10    -   +   +   -   +   -   +   +   +
         11    -   -   +   +   -   +   -   +   +
         12    +   -   -   +   +   -   +   -   +
         13    -   +   -   -   +   +   -   +   -
         14    -   -   +   -   -   +   +   -   +
         15    -   -   -   +   -   -   +   +   -
         16    -   -   -   -   -   -   -   -   -


Yates put forward a similar design in his 1935 paper for the weighing of a number of small
articles on a balance which required a zero correction, as an example of the estimation of the
effects of independent factors. In his case there was a close formal analogy to the $2^n$ factorial
system, and it will now be shown that Plackett & Burman’s design given above is a
high-order fractional design of the type discussed in the present paper.

Denoting the nominal values by unity and the extreme values of the nine components by
a, b, c, d, e, f, g, h, k in order, the treatment combinations represented are 1, aehk, abfk, abcg,
abcdh, bcdek, acdef, bdefg, acefgh, abdfghk, bceghk, cdfhk, adegk, befh, cfgk, dgh. It is found merely
by one-by-one examination of the three-factor interactions that all the above sets of
treatment combinations occur with the same sign in the following:

ABE, ACK, BCF, CDG, DEH.
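The one-by-one examination can be automated. The sketch below lists the sixteen treatment combinations (the empty string stands for the combination with every component at its nominal value) and checks whether a given interaction contrast takes the same sign on all of them. The sign convention, +1 for a component at its extreme value and -1 at its nominal value with an interaction as the product, is an assumption made for illustration.

```python
RUNS = ["", "aehk", "abfk", "abcg", "abcdh", "bcdek", "acdef", "bdefg",
        "acefgh", "abdfghk", "bceghk", "cdfhk", "adegk", "befh", "cfgk", "dgh"]

def sign(word, run):
    """Sign of the contrast `word` on a run: a factor of -1 for each letter at its nominal value."""
    return (-1) ** sum(letter.lower() not in run for letter in word)

def constant_sign(word):
    """True when the contrast has the same sign on all sixteen assemblies."""
    return len({sign(word, run) for run in RUNS}) == 1

for word in ["ABE", "ACK", "BCF", "CDG", "DEH"]:
    print(word, constant_sign(word), sign(word, RUNS[0]))
```

Each of the five interactions has a constant sign, which under this convention is negative; a contrast such as AB, by comparison, changes sign from run to run.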

The same will be true for all the members of the Abelian group of which the above five
interactions are generators. The identity relationship is therefore:

I = ABE = ACK = BCEK = BCF = ACEF = ABFK = EFK

= CDG = ABCDEG = ADGK = BDEGK = BDFG = ADEFG = ABCDFGK = CDEFGK

= DEH = ABDH = ACDEHK = BCDHK = BCDEFH = ACDFH = ABDEFHK = DFHK

= CEGH = ABCGH = AEGHK = BGHK = BEFGH = AFGH = ABCEFGHK = CFGHK.

The identities of interest to the experimenter are the following:

I = ABE = ACK = BCF = EFK = CDG = DEH;

from these we derive the following aliases for main effects:

A = BE = CK,          F = BC = EK,

B = AE = CF,          G = CD,

C = AK = BF = DG,     H = DE,

D = CG = EH,          K = AC = EF.

E = AB = FK = DH,
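The thirty-two members of the group, its three-letter words, and the resulting aliases of the main effects can all be generated from the five interactions ABE, ACK, BCF, CDG and DEH. A minimal sketch, with signs ignored and illustrative names:

```python
GENERATORS = ["ABE", "ACK", "BCF", "CDG", "DEH"]

def times(u, v):
    """Product of two effect symbols; squared letters cancel."""
    return "".join(sorted(set(u) ^ set(v)))

group = {""}                      # "" stands for the identity I
for g in GENERATORS:
    group |= {times(w, g) for w in group}

three_letter = sorted(w for w in group if len(w) == 3)
main_effect_aliases = {
    f: sorted(times(f, w) for w in three_letter if f in w) for f in "ABCDEFGHK"
}
print(len(group), three_letter)
for f, al in main_effect_aliases.items():
    print(f, "=", " = ".join(al))
```

Only the three-letter words matter for the two-factor aliases of main effects, which is why the list of identities of interest has six members.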

In all cases, the contrasts estimating main effects are minus the contrasts estimating
interactions. If, for example, the interaction of B and E is negative, and A has no effect, the
conclusion drawn by the experimenter will be that A has a positive effect. It is possible but
rather difficult to imagine physical systems in which effects will not interact, and
interpretation of the results of experiments based on this design may often be impossible. With nine
factors, it appears from the present work that the minimum number of combinations which
should be tested is 128, that is one-quarter of a replication, though it is possible that by making
less stringent assumptions about two-factor interactions, one-eighth of a replication might 
give intelligible results. A possible instance in which it might be feasible to use the designs 
discussed is when it is expected that only one or two of the factors have an effect, and the 
problem is to determine as quickly as possible which of the nine factors are responsible. 
An example in which a high-order fractional design was used in such circumstances with good 
results has been described by Tippett (1936). A detailed examination of all the designs put 
forward by Plackett & Burman will not be undertaken, but the lines on which such an 
examination would proceed and the broad conclusions which would emerge are obvious 
from the above examination of one of their simpler designs. 
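The danger described above can be shown numerically. The sketch below invents a noiseless response in which no component has a main effect and only B and E interact, with a negative coefficient; the response function and its coefficient are purely hypothetical. The usual estimate of the effect of A is then positive, while a component not aliased with BE shows no effect.

```python
RUNS = ["", "aehk", "abfk", "abcg", "abcdh", "bcdek", "acdef", "bdefg",
        "acefgh", "abdfghk", "bceghk", "cdfhk", "adegk", "befh", "cfgk", "dgh"]

def level(letter, run):
    """+1 when the component is at its extreme value in the run, -1 at its nominal value."""
    return 1 if letter in run else -1

def response(run, be_interaction=-2.0):
    """Hypothetical noiseless response: no main effects, only a B x E interaction."""
    return be_interaction * level("b", run) * level("e", run)

def estimated_effect(letter):
    """Usual main-effect estimate: mean at the extreme value minus mean at the nominal value."""
    hi = [response(r) for r in RUNS if letter in r]
    lo = [response(r) for r in RUNS if letter not in r]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

print(estimated_effect("a"), estimated_effect("c"))
```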


Conclusions 

A method of examining fractional replication and confounding for some types of factorial 
experiments is described. The formal equivalence between the two is indicated and the 
implications of this equivalence discussed. Further progress will follow on group theory 
lines and this is being examined, together with the possibility of fractional replication when 
the fraction is greater than unity. The possibilities are explored of the estimation of main 
effects and two-factor interactions of many factors by testing only a small proportion of 
the possible treatment combinations. An examination on these lines is made of designs 
proposed by Plackett & Burman. 


REFERENCES

Finney, D. J. (1946). The fractional replication of factorial arrangements. Ann. Eugen., Lond., 12,
291-301.

Fisher, R. A. (1942). The theory of confounding in factorial experiments in relation to the theory of
groups. Ann. Eugen., Lond., 11, 341-53.

Kempthorne, O. (1947). A note on differential responses in blocks. J. Agric. Sci. 37, 245-48.

Plackett, R. L. & Burman, J. P. (1946). The design of optimum multifactorial experiments.
Biometrika, 33, 305-26.

Tippett, L. H. C. (1936). Applications of Statistical Methods to the Control of Quality in Industrial
Production. Manchester Statist. Soc.

Yates, F. (1935). Complex experiments. J. R. Statist. Soc. Suppl. 2, 181-223.

Yates, F. (1937). The design and analysis of factorial experiments. Tech. Commun. Imp. Bur. Soil Sci.,
no. 36.





A COMPARISON OF STRATIFIED WITH UNRESTRICTED RANDOM
SAMPLING FROM A FINITE POPULATION*

By P. ARMITAGE, B.A. 

1. Introduction 

1·1. We are concerned in this paper with the problem of estimating the mean value
of a variable x in a population, by taking a sample which is in some way representative of the
population. It has been realized since Bowley’s paper (1926), and more particularly since
Neyman’s more comprehensive survey (1931), that a certain degree of precision in the
estimate can often be obtained more economically by stratified random sampling (usually
referred to merely as stratified sampling) than by unrestricted random sampling (usually
called merely random sampling). In the stratified method, the population is divided into
several strata, the sample size divided in some prearranged way among the strata, and
sampling performed at random from each stratum. In unrestricted random sampling, a
random selection is made from the whole population, and the method may be regarded as
a particular case of stratification, where the number of strata is one.

Some text-books deal briefly with stratified sampling. Wilks (1943) considers only
infinite populations, and denotes by representative sampling what we should call a
particular type of stratified sampling (see § 1·2). The subject is treated by Kendall (1946,
pp. 249-62), but he makes no comparison with unrestricted random sampling. We shall
begin by introducing several well-known results which will be needed later.

1·2. The summation sign $\sum_l$ will be used throughout for $\sum_{l=1}^{r}$, and $\sum_k$ for $\sum_{k=1}^{r}$. In general,
$\sum_l$ is used for a single summation, $\sum_l\sum_k$ for a double summation, and the suffix k where no
summation is involved.

We shall consider the following position: A population $\pi$ of size N is subdivided into r
strata, $\pi_k$, of size $N_k$ ($\sum_l N_l = N$). The variable x is distributed so that the mean and variance
(divisor $N_k$) within $\pi_k$ are respectively $\mu_k$, $\sigma_k^2$. It is required to estimate $\mu = \sum_l N_l\mu_l/N$, the
grand mean.

Suppose a given sample size, n, is divided so that $n_k$ items are sampled at random from
$\pi_k$ ($\sum_l n_l = n$). We may denote the jth observation from the kth sample by $x_{jk}$ ($j = 1, 2, \dots, n_k$),
and the mean and variance of the kth sample by $\bar{x}_k$ and $s_k^2$, which are known to be unbiased
estimates of $\mu_k$ and $N_k\sigma_k^2/(N_k - 1)$, respectively (see, for example, Kendall, 1943, p. 284).

It seems intuitively obvious to take as our estimate of $\mu$,

$m = \sum_l N_l\bar{x}_l/N$, (i)

which is clearly unbiased. This is, however, not the only unbiased estimate which is a linear
function of the $x_{jk}$. For instance, also satisfies the conditions. Neyman (1934)
has shown that, for fixed values of $n_k$, the estimate given by (i) is the best linear unbiased
estimate of $\mu$, in the sense that its sampling variance is less than that of any other linear
unbiased estimate.
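As a small illustration of estimate (i), the sketch below computes m as the weighted mean of the stratum sample means, with weights proportional to the stratum sizes. The stratum sizes and observations are invented for the example, and the function name is illustrative.

```python
def stratified_mean(stratum_sizes, samples):
    """Estimate (i): m = sum over strata of N_k * xbar_k / N, xbar_k being the k-th sample mean."""
    total = sum(stratum_sizes)
    return sum(
        size * sum(sample) / len(sample)
        for size, sample in zip(stratum_sizes, samples)
    ) / total

sizes = [60, 40]                       # hypothetical stratum sizes N_k
samples = [[10, 12, 14], [20, 22]]     # hypothetical observations x_jk
print(stratified_mean(sizes, samples))
```

Here the sample means are 12 and 21, so m = (60 x 12 + 40 x 21)/100.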

* Communication from the National Physical Laboratory. 




The question now arises: given a sample size n, how shall we choose the $n_k$ so as to minimize
var(m), where m is given by (i)? Bowley had not considered ‘best’ estimates, and he
suggested that $n_k$ should be proportional to $N_k$, i.e.

$n_k = \dfrac{nN_k}{N}$. (ii)

Neyman (1934) showed, by the method given in § 2, that the values of $n_k$ which minimize
var(m) are

$n_k = \dfrac{nN_k\sigma_k\sqrt{[N_k/(N_k-1)]}}{\sum_l N_l\sigma_l\sqrt{[N_l/(N_l-1)]}} = \dfrac{nN_k\sigma'_k}{\sum_l N_l\sigma'_l}$, (iii)

where $\sigma'_k = \sigma_k\sqrt{[N_k/(N_k-1)]}$.
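The two allocations can be compared on invented figures. The sketch below implements (ii) and (iii) as written; the stratum sizes and standard deviations in the usage lines are hypothetical, and the results are left unrounded, whereas in practice the sample sizes would have to be whole numbers.

```python
from math import sqrt

def proportionate(n, sizes):
    """Allocation (ii): n_k = n * N_k / N."""
    total = sum(sizes)
    return [n * size / total for size in sizes]

def optimum(n, sizes, sigmas):
    """Allocation (iii): n_k proportional to N_k * sigma_k * sqrt(N_k / (N_k - 1))."""
    weights = [N * s * sqrt(N / (N - 1)) for N, s in zip(sizes, sigmas)]
    return [n * w / sum(weights) for w in weights]

sizes, sigmas = [600, 400], [1.0, 3.0]   # hypothetical N_k and sigma_k
print(proportionate(100, sizes))
print(optimum(100, sizes, sigmas))
```

The optimum allocation moves observations towards the more variable stratum, even though that stratum is the smaller of the two.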

We shall refer to these two methods of defining the $n_k$, by (ii) and (iii) respectively, as
proportionate sampling and optimum stratified sampling, denoting by $m_p$ and $m_o$ the estimates
of $\mu$ obtained from (i) by the two methods, and by $\bar{x}$ the estimate of $\mu$ given by the mean of
an unrestricted random sample of n from the whole population $\pi$.

The optimum stratified method thus requires a knowledge of the $\sigma_k$. In practice, we should
never know the $\sigma_k$ exactly, unless the population had been subjected to exhaustive
sampling, in which case $\mu$ would be known exactly. Sukhatme (1936) has shown that, at any rate
for large $N_k$, if the $\sigma_k^2$ are estimated from a preliminary sample, and the $n_k$ defined by using
these estimates in (iii), there is a high probability that var($m_o$) < var($m_p$).* The efficiency
of this method will of course depend on the size of the preliminary sample, and Sukhatme’s
investigation only dealt with one value of this (16 from each stratum). In some cases we
should be able to form a fairly good estimate of the $\sigma_k$ from past experience, and there would
be no need for a preliminary sample.

Another interesting comparison which has not been extensively investigated is that
between optimum stratified sampling and unrestricted random sampling. Wilks (1943) deals
with this for infinite populations, and obtains (pp. 88, 89) the result (in our notation)

$\operatorname{var}(m_o) \le \operatorname{var}(m_p) \le \operatorname{var}(\bar{x})$, (iv)

the first equality holding only when all the $\sigma_k$ are equal, and the second only when all the
$\mu_k$ are equal. (Our $N_k/N$ are replaced by $p_k$, where $p_k$ is the probability that x, when drawn
at random from $\pi$, is a member of $\pi_k$, so that, for instance, (iii) becomes

$n_k = \dfrac{np_k\sigma_k}{\sum_l p_l\sigma_l}$.

Representative sampling as defined by Wilks is what we should call proportionate sampling.)
We shall show in § 2 that for finite populations, while the relation

$\operatorname{var}(m_o) \le \operatorname{var}(m_p)$ (v)

is always true, the equality holding only when all the $\sigma'_k$ are equal, it is not necessarily true
that

$\operatorname{var}(m_p) \le \operatorname{var}(\bar{x})$, (vi)

* No confusion need arise from the fact that the symbol $m_o$ and the term optimum are still used
when estimates of the $\sigma'_k$ are used in (iii).


P. Armitage 




and in fact in the limiting case when all the $\mu_k$ are equal, it is true that

$$\operatorname{var}(m_p) > \operatorname{var}(\bar{x}), \tag{via}$$

so that if the $\sigma'_k$ are also equal

$$\operatorname{var}(m_o) > \operatorname{var}(\bar{x}); \tag{vib}$$

i.e. random sampling gives a more accurate estimate of the mean than any stratified sampling.
We shall see, however, that in almost all practical cases (iv) is true.


2. Derivation of formulae

2.1. Results (iii) and (v). Using the notation of § 1.2, we have the standard result that

$$\operatorname{var}(\bar{x}_k) = \frac{N_k - n_k}{N_k - 1}\,\frac{\sigma_k^2}{n_k}. \tag{vii}$$

Therefore from (i),

$$\operatorname{var}(m) = \sum_k \frac{N_k^2}{N^2}\,\frac{N_k - n_k}{N_k - 1}\,\frac{\sigma_k^2}{n_k}. \tag{viii}$$

The result (iii) may be obtained quite easily by finding the values of the $n_k$ which minimize
(viii) subject to the condition $\sum_k n_k = n$, using the method of Lagrange multipliers. Then,
substituting (ii) and (iii) in (viii), and applying Schwarz's inequality, we have (v). The
following method is due to Neyman.

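It may be verified from (viii) that

$$\operatorname{var}(m) = \frac{N-n}{nN^2}\sum_k N_k\sigma'^2_k + \frac{1}{N^2}\sum_k n_k\left(\frac{N_k\sigma'_k}{n_k} - \frac{\sum_i N_i\sigma'_i}{n}\right)^2 - \frac{1}{nN}\sum_k N_k(\sigma'_k - \bar{\sigma}')^2, \tag{ix}$$

where $\bar{\sigma}' = \sum_k N_k\sigma'_k/N$.

If we denote the three terms of (ix) by $A$, $B$ and $C$, so that

$$\operatorname{var}(m) = A + B - C,$$

it will be seen that $A$ and $C$ are independent of the $n_k$ and, since $B$ is non-negative, it follows that
the values of $n_k$ which minimize $\operatorname{var}(m)$ must minimize $B$. Now $B = 0$ if and only if

$$n_k = \frac{nN_k\sigma'_k}{\sum_i N_i\sigma'_i},$$

which is (iii). For these values of $n_k$, $m = m_o$, and

$$\operatorname{var}(m_o) = A - C. \tag{x}$$

If we define $n_k$ by (ii), so that $m = m_p$, we see from (ix) that $B = C$, so that

$$\operatorname{var}(m_p) = A. \tag{xi}$$

From (x) and (xi), we obtain (v), the equality holding only when $C = 0$, which is true only
when $\sigma'_k - \bar{\sigma}' = 0$ for all $k$, i.e. when the $\sigma'_k$ are all equal.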
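
The relations $\operatorname{var}(m_p) = A$ and $\operatorname{var}(m_o) = A - C$ can be checked numerically. The sketch below assumes the usual finite-population form of the stratified variance, $\sum_k (N_k/N)^2\,\{(N_k-n_k)/(N_k-1)\}\,\sigma_k^2/n_k$, and writes $A = \{(N-n)/(nN^2)\}\sum_k N_k\sigma'^2_k$ and $C = \sum_k N_k(\sigma'_k-\bar{\sigma}')^2/(nN)$, which is one algebraically convenient form of the two terms. The population values are hypothetical and the $n_k$ are allowed to be fractional.

```python
from math import sqrt


def var_stratified(N_k, sigma_k, n_k):
    """Variance of the weighted stratified mean for a given allocation n_k."""
    N = sum(N_k)
    return sum(
        (Nk / N) ** 2 * (Nk - nk) / (Nk - 1) * s ** 2 / nk
        for Nk, s, nk in zip(N_k, sigma_k, n_k)
    )


# Hypothetical population (illustrative values only).
N_k = [60, 40, 30]
sigma_k = [2.0, 3.0, 4.0]
n = 13
N = sum(N_k)

# sigma'_k = sigma_k * sqrt(N_k / (N_k - 1))
sigma_p = [s * sqrt(Nk / (Nk - 1)) for Nk, s in zip(N_k, sigma_k)]
T = sum(Nk * sp for Nk, sp in zip(N_k, sigma_p))
sigma_bar = T / N

n_prop = [n * Nk / N for Nk in N_k]                        # allocation (ii)
n_opt = [n * Nk * sp / T for Nk, sp in zip(N_k, sigma_p)]  # allocation (iii)

A = (N - n) / (n * N ** 2) * sum(Nk * sp ** 2 for Nk, sp in zip(N_k, sigma_p))
C = sum(Nk * (sp - sigma_bar) ** 2 for Nk, sp in zip(N_k, sigma_p)) / (n * N)

var_mp = var_stratified(N_k, sigma_k, n_prop)
var_mo = var_stratified(N_k, sigma_k, n_opt)
print("var(m_p) =", var_mp, " A     =", A)
print("var(m_o) =", var_mo, " A - C =", A - C)
```

Since $C \ge 0$, the printed values also illustrate (v): the optimum allocation never does worse than the proportionate one when fractional $n_k$ are permitted.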

2.2. Unrestricted random sampling. The variance of a random observation $x$ from $\pi$ is

$$\sigma^2 = \frac{\sum_i N_i\sigma_i^2}{N} + S,$$

where $S$ is the weighted sum of squares of the $\mu_i$, i.e. $S = \sum_i N_i(\mu_i - \mu)^2/N$. From (vii),

$$\operatorname{var}(\bar{x}) = \frac{N-n}{N-1}\,\frac{\sigma^2}{n} = \frac{N-n}{nN(N-1)}\sum_i N_i\sigma_i^2 + \frac{N-n}{n(N-1)}\,S.$$

From (xi),

$$\operatorname{var}(m_p) = \frac{N-n}{nN^2}\sum_i N_i\sigma'^2_i.$$

Denoting $\sum_i N_i\sigma_i^2/(N-1)$ by $H$, and $\sum_i N_i\sigma'^2_i/N$ by $K$, we have

$$\operatorname{var}(\bar{x}) = \frac{N-n}{Nn}\,H + \frac{N-n}{(N-1)n}\,S$$

and

$$\operatorname{var}(m_p) = \frac{N-n}{Nn}\,K.$$

Now

$$K - H = \frac{\sum_i (N - N_i)\,\sigma'^2_i}{N(N-1)}, \tag{xii}$$

and if we regard each $N_i$ as being of the same order, $O(N)$, then $K - H$ is $O(N^{-1})$, which means
that when all the $\mu_k$ are equal, $S = 0$, and so

$$\operatorname{var}(\bar{x}) < \operatorname{var}(m_p),$$

which is (via); but as $N \to \infty$,

$$\operatorname{var}(\bar{x}) \sim \operatorname{var}(m_p) + S/n, \tag{xiii}$$

giving Wilks's result (p. 88) that for infinite populations (vi) is true, the equality holding
only when all the $\mu_k$ are equal.

From (v) and (via) it follows that for finite populations, when all the $\mu_k$ are equal and all
the $\sigma'_k$ are equal, (vib) is true, i.e. in this case unrestricted random sampling is actually
better than any stratified random sampling with the same sample size.
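
A small numerical check of (via) may help. The Python sketch below assumes the finite-population variance of an unrestricted sample mean, $\{(N-n)/(N-1)\}\,\sigma^2/n$ with $\sigma^2 = \sum_i N_i\sigma_i^2/N + S$, and the proportionate-sampling variance $\{(N-n)/(nN^2)\}\sum_i N_i^2\sigma_i^2/(N_i-1)$ with fractional $n_k$. The stratum sizes and standard deviations are those of Example 1 at $N = 13$; the unequal means are hypothetical, and the function names are introduced here for illustration.

```python
def var_unrestricted(N_k, sigma_k, mu_k, n):
    """Variance of the mean of an unrestricted random sample of size n."""
    N = sum(N_k)
    mu = sum(Nk * m for Nk, m in zip(N_k, mu_k)) / N
    within = sum(Nk * s ** 2 for Nk, s in zip(N_k, sigma_k)) / N
    S = sum(Nk * (m - mu) ** 2 for Nk, m in zip(N_k, mu_k)) / N
    return (N - n) / (N - 1) * (within + S) / n


def var_proportionate(N_k, sigma_k, n):
    """Variance of the stratified mean with n_k = n * N_k / N (fractional)."""
    N = sum(N_k)
    total = sum(Nk ** 2 * s ** 2 / (Nk - 1) for Nk, s in zip(N_k, sigma_k))
    return (N - n) / (n * N ** 2) * total


N_k = [6, 4, 3]
sigma_k = [2.0, 3.0, 4.0]
n = 4

var_mp = var_proportionate(N_k, sigma_k, n)
var_x_equal = var_unrestricted(N_k, sigma_k, [0.0, 0.0, 0.0], n)    # S = 0
var_x_unequal = var_unrestricted(N_k, sigma_k, [0.0, 2.0, 5.0], n)  # S > 0

print("var(m_p)            =", round(var_mp, 4))
print("var(x), equal means =", round(var_x_equal, 4))
print("var(x), S > 0       =", round(var_x_unequal, 4))
```

With equal stratum means the unrestricted sample is the more accurate of the two, and a sufficiently large spread between the means reverses the ordering.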


3. General comparison

3.1. From (ix), (x) and (xii),

$$\phi = \operatorname{var}(\bar{x}) - \operatorname{var}(m_o) = \frac{N-n}{(N-1)n}\,S + \frac{1}{nN^2(N-1)}\,(P - Q - R), \tag{xiv}$$

where

$$P = (N-1)\Bigl[N\sum_i N_i\sigma'^2_i - \bigl(\sum_i N_i\sigma'_i\bigr)^2\Bigr] \ge 0 \quad \text{(equality if all } \sigma'_i \text{ are equal)},$$

$$Q = n\Bigl(\sum_i N_i\sigma'^2_i - N\sum_i \sigma'^2_i\Bigr) < 0,$$

$$R = N^2\sum_i \sigma'^2_i - N\sum_i N_i\sigma'^2_i > 0.$$

As $N \to \infty$, $P$, $Q$ and $R$ are respectively $O(N^3)$, $O(N)$ and $O(N^2)$, and so we have the result
that for infinite populations $\phi \ge 0$, which with (xiii) is easily seen to be equivalent to Wilks's
result (iv).

In the finite case, however, by suitable choice of the $\sigma_k$ and $n$ we can make $\phi$ either positive
or negative. For instance, if the $\sigma_k$ are all equal and $n$ is sufficiently small, $R$ predominates
in (xiv), and $\phi < 0$. As $n$ increases to $N$, $\phi$ increases to 0. (By considering $Q$ and $R$, it is not
obvious that $\phi \to 0$ in this case, but it must be remembered that (xiv) is only true if the $n_k$
are given by (iii), and this becomes impossible as $n$ approaches $N$. This will be remarked
upon below.) If the $\sigma_k$ are sufficiently unequal, $P$ will predominate and $\phi > 0$. In this case
the factor $P - R$ is positive, and $\phi$ will decrease as $n$ increases.

The situations, then, in which (vib) is likely to be true (provided that the $n_k$ are really
given by (iii)) are when the $\mu_k$ are nearly equal, and when $N$ is small or the $\sigma_k$ are nearly
equal. We shall consider some examples in § 4.

3.2. In applying the procedure of stratification, we shall make two departures from the
theory outlined above which will tend to nullify the advantages of the stratified method.
The first is that, as was pointed out in § 1.2, we shall never know the $\sigma_k$ exactly, and the
degree to which our estimates from which the $n_k$ were obtained are accurate depends on the
circumstances. It seems quite likely that Sukhatme's result will be fairly well applicable to
finite populations, but there is an opportunity for research on this point.

The second respect in which we depart from theory lies in the fact that, even if the $\sigma_k$ are
exactly known, the $n_k$ that we choose can never be exactly as given by (iii); first because they
must be integers, which makes a considerable difference when $n$ is small (the size of the
smallest stratified sample from which an unbiased estimate of $\mu$ can be made is clearly $r$);
and secondly, $n_k$ cannot take values greater than $N_k$. In this latter case, if the values of, say,
$s$ of the $n_k$, as given by (iii), are greater than the corresponding $N_k$, we should let $n_k = N_k$
for these $s$ strata, and then set the other $(r - s)$ values of $n_k$ proportional to the corresponding
$N_k\sigma'_k$. This will clearly decrease $\operatorname{var}(\bar{x}) - \operatorname{var}(m_o)$ as given by (xiv). For example, when
$n = N$, we have

$$\operatorname{var}(\bar{x}) = \operatorname{var}(m_o) = 0,$$

but the right-hand side of (xiv) is

$$\frac{1}{N^3}\Bigl[N\sum_i N_i\sigma'^2_i - \bigl(\sum_i N_i\sigma'_i\bigr)^2\Bigr] \ge 0$$

(equality holding if all the $\sigma'_k$ are equal). In fact both these limitations will decrease the
theoretical advantage (if any) of stratified over random sampling, and we must take them
into account in assessing the relative merits of the two methods.


4. Examples 

In the four examples illustrated by Figs. 1-4, $\operatorname{var}(m_o)$ and $\operatorname{var}(\bar{x})$ have been calculated for
different stratified populations, and $\psi = \log_{10}\{\operatorname{var}(\bar{x})/\operatorname{var}(m_o)\}$ plotted against $c = n/N$,
so that $\psi < 0$ if $\operatorname{var}(\bar{x}) < \operatorname{var}(m_o)$. In each figure the different curves represent populations
with the same $\sigma_k$, with the $N_k$ in the same proportions but with different magnitudes, and
with the $\mu_k$ equal, so that $S = 0$.

Example 1. $\sigma_k = 2, 3, 4$; $N_k \propto 6, 4, 3$ ($N = 65, 26, 13$).

Example 2. $\sigma_k = 4, 5, 6$; $N_k \propto 6, 5, 4$ ($N = 120, 60, 30, 15$).

Example 3. $\sigma_k = 4, 5, 6$; $N_k \propto 3, 11, 4$ ($N = 126, 54, 36, 18$).

Example 4. $\sigma_k = 1, 1, 2, 3, 4$; $N_k \propto 5, 5, 1, 2, 3$ ($N = 128, 32$).

The first thing to be noticed about the graphs is that in each one $\psi$ increases, generally
speaking, as $n$ increases. Further, in any one example the range of $c$ for which $\psi < 0$ increases
as $N$ decreases; and in this sense we can say that for small samples of proportionate size
from a stratified population, the advantage (if any) of the stratified method decreases as
$N$ decreases.

Secondly, the curves are not smooth. The reason for this is clear. In the optimum stratified
method the $n_k$ are to be chosen approximately proportional to $N_k\sigma_k$ (a second approximation
is $N_k\sigma'_k$). In Example 1, the $N_k\sigma_k$ are all equal, and it follows that the $n_k$ should be
nearly equal. If $n \equiv 0 \pmod 3$ this can be done, but for $n \equiv 1, 2 \pmod 3$, $\operatorname{var}(m_o)$ takes values
greater than it would if fractional $n_k$ were allowed. This produces a rise in the curve of $\psi$
for $n \equiv 0 \pmod 3$, which gradually disappears as $n$ increases since the effect is much greater
for small $n$. The same 'period' is noticeable in Fig. 2, but in Figs. 3 and 4, where the main
'periods' are respectively 15 and 30, the effect is smaller.

We saw in § 3.1 that, broadly speaking, the advantage of the stratified method decreases
as the $\sigma_k$ tend to equality. This is illustrated by comparing Examples 1 and 2. In each of
these the $N_k\sigma_k$ are equal, but in Example 2 the $\sigma_k$ are proportionally more nearly equal.
Comparing curves for about the same $N$ ($N = 65, 26, 13$ in Fig. 1 with $N = 60, 30, 15$ in Fig. 2),
we see that in Fig. 2 the range of values of $c$ for which $\psi < 0$ is greater than in Example 1.

Fig. 3 has the same $\sigma_k$ as Fig. 2, but the $N_k$, and therefore the $n_k$, are different. The
curves are similar to those of Fig. 2, but the stratified method is still less advantageous
(especially for small values of $c$).

Example 4 has five instead of three strata, and there is quite large variation between the
$\sigma_k$ and between the $N_k$. There is no doubt here that $\psi > 0$, the only exception being for
$N = 32$, $n = 5$, where $\psi = -0.02$.

These examples may be said to give the maximum advantage to the stratified method, in
the sense that the calculated values of $\operatorname{var}(m_o)$ depend on the best method of choosing the $n_k$.
If the $\sigma_k$ are not sufficiently well known to enable the best values of $n_k$ to be used, then we
shall get a larger value of $\operatorname{var}(m_o)$. It must be remembered, however, that in all these
examples we assumed that there was no variation between the $\mu_k$, a situation which would
be very unlikely to occur in practice. Now it is clear from (xii) that if the same $\sigma_k$ and $N_k$
are considered as in one of the above examples, but the $\mu_k$ are now unequal, the effect is to
increase the value of $\operatorname{var}(\bar{x})$ by $(N - n)S/\{(N - 1)n\}$, where $S = \sum_i N_i(\mu_i - \mu)^2/N$; so, in any
example where $\psi < 0$ for some particular values of $N$ and $n$, we can reverse the direction of
the inequality by choosing a sufficiently large value of $S$, say $S > S_0$, where

$$S_0 = [\operatorname{var}(m_o) - \operatorname{var}(\bar{x})]\,(N - 1)n/(N - n).$$

In comparing different values of $S_0$ for different examples, it must be remembered that the
order of magnitude of $S_0$ depends on the $\sigma_k$, and a suitable measure of comparison will be
$S_0/\bar{\sigma}^2$, where $\bar{\sigma}^2$ is the pooled variance within strata $= \sum_i N_i\sigma_i^2/N$.

In Example 1, the largest value of $S_0$ is for $N = 13$, $n = 4$. Here $\operatorname{var}(m_o) = 1.9172$,
$\operatorname{var}(\bar{x}) = 1.5577$, and $S_0 = 1.917 = 0.231\bar{\sigma}^2$. (If $\mu_1 = 0$, $\mu_2 = 2$, $\mu_3 = 3.6$, then $S = 2.066$.)

In Example 2, the largest value of $S_0$ is for $N = 15$, $n = 4$. Here $\operatorname{var}(m_o) = 24.647$,
$\operatorname{var}(\bar{x}) = 19.119$, and $S_0 = 28.14 = 0.289\bar{\sigma}^2$. (If $\mu_1 = 0$, $\mu_2 = 7$, $\mu_3 = 13$, then $S = 28.51$.)

In Example 3, the largest value of $S_0$ is for $N = 18$, $n = 3$. Here $\operatorname{var}(m_o) = 46.236$,
$\operatorname{var}(\bar{x}) = 30.523$, and $S_0 = 53.42 = 0.515\bar{\sigma}^2$. (If $\mu_1 = 0$, $\mu_2 = 8$, $\mu_3 = 17$, then $S = 67.6$.)

In Example 4, the largest value of $S_0$ is for $N = 32$, $n = 5$ (the only occasion in this example
where $\psi < 0$). Here $\operatorname{var}(m_o) = 0.91406$, $\operatorname{var}(\bar{x}) = 0.87097$, and $S_0 = 0.2474 = 0.049\bar{\sigma}^2$.
(If ^^2 = 0 and /Iq — fii ~ = 1, then 8 = 0*286,} 
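
The figures quoted for Example 1 can be reproduced by a short search. The sketch below is an illustration rather than the author's computation: it assumes that the 'best method of choosing the $n_k$' means the integer allocation with every $n_k \ge 1$ that minimizes the stratified variance, takes $N_k = 6, 4, 3$ and $\sigma_k = 2, 3, 4$ with $n = 4$ and $S = 0$, and uses function names introduced here.

```python
from itertools import product
from math import log10


def var_stratified(N_k, sigma_k, n_k):
    """Variance of the weighted stratified mean for integer allocation n_k."""
    N = sum(N_k)
    return sum(
        (Nk / N) ** 2 * (Nk - nk) / (Nk - 1) * s ** 2 / nk
        for Nk, s, nk in zip(N_k, sigma_k, n_k)
    )


def var_unrestricted(N_k, sigma_k, n, S=0.0):
    """Variance of an unrestricted sample mean; S is the between-strata term."""
    N = sum(N_k)
    within = sum(Nk * s ** 2 for Nk, s in zip(N_k, sigma_k)) / N
    return (N - n) / (N - 1) * (within + S) / n


def best_integer_allocation(N_k, sigma_k, n):
    """Brute-force search over integer allocations with 1 <= n_k <= N_k."""
    best_var, best_alloc = None, None
    for n_k in product(*(range(1, Nk + 1) for Nk in N_k)):
        if sum(n_k) != n:
            continue
        v = var_stratified(N_k, sigma_k, n_k)
        if best_var is None or v < best_var:
            best_var, best_alloc = v, n_k
    return best_var, best_alloc


N_k, sigma_k, n = [6, 4, 3], [2.0, 3.0, 4.0], 4
N = sum(N_k)

var_mo, alloc = best_integer_allocation(N_k, sigma_k, n)
var_x = var_unrestricted(N_k, sigma_k, n)
psi = log10(var_x / var_mo)
S0 = (var_mo - var_x) * (N - 1) * n / (N - n)

print("best allocation:", alloc)
print("var(m_o) = %.4f, var(x) = %.4f" % (var_mo, var_x))
print("psi = %.4f, S0 = %.3f" % (psi, S0))
```

Under these assumptions the search selects the allocation (1, 1, 2) and gives a stratified variance of about 1.9172, an unrestricted variance of about 1.5577 and $S_0$ of about 1.917, so that $\psi$ is negative for this case.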


5. Conclusions

We have seen in § 3 that optimum stratified sampling may give a less accurate estimate of
$\mu$ than unrestricted random sampling when the $\mu_k$ are nearly equal, and when $N$ is small or
the $\sigma_k$ are nearly equal. The examples of § 4 bear out these conclusions and show that the
effect is greatest for small $n$, Fig. 3 providing an additional suggestion that if the products
$N_k\sigma_k$ are widely different the advantage of the stratified method tends to be nullified. In
practice, we should probably only apply stratified sampling if we knew that the strata were
sufficiently distinct to ensure considerable variation between either the $\mu_k$ or the $\sigma_k$. In
the first case, if nothing much was known about the $\sigma_k$ and a preliminary sample on the lines
suggested by Sukhatme was impracticable, we should use proportionate sampling, and the
size of $S$ would usually ensure that $\operatorname{var}(m_p) < \operatorname{var}(\bar{x})$. In the second case, we should use
optimum stratified sampling, and rely on the variability of the $\sigma_k$ to ensure that
$\operatorname{var}(m_o) < \operatorname{var}(\bar{x})$. Since an adequate degree of knowledge about the $\sigma_k$ would be unlikely
unless the $N_k$ were quite large, we should in this case almost certainly be safe in using the
method. To the above considerations must be added the fact that if very inaccurate estimates
of the $\sigma_k$ are used in (iii), then, whatever the nature of the population, the resulting procedure
may be extremely inefficient.

It must be realized, of course, that even if it were known that $\operatorname{var}(m_o) < \operatorname{var}(\bar{x})$, it would
not follow that the optimum stratified method would necessarily be the most convenient.
It may be impossible, or at any rate inconvenient, to do any sort of random sampling, and
some sort of quasi-random sampling may have to be used (see e.g. Madow & Madow, 1944),
but if the principle of random sampling is applicable the stratified method is not likely to
be much more inconvenient, and in fact in most cases will be more convenient, than the
unrestricted method.

Summary 

The stratified method has been used in the past almost solely for large-scale social and
agricultural surveys. Here the stratum sizes are large, and known results for infinite populations
apply. There seems no reason why stratified sampling should not be used to advantage
for smaller populations, and it is important to know to what extent these results still apply. In
this paper a comparison has been made with unrestricted random sampling in the usual case
where we are interested in estimating the mean. The advantages of the stratified method are
modified, but in most cases where the method is applicable it will be found to be worth while.

The above work was carried out as part of the research programme of the National
Physical Laboratory, and this paper is published by permission of the Director of the
Laboratory. The author desires to acknowledge the assistance rendered by Mr D. V. Lindley
who prepared the diagrams.

References

Bowley, A. L. (1926). Measurement of the precision attained in sampling. Bull. Int. Statist. Inst. 22, 1re livraison.

Kendall, M. G. (1943). The Advanced Theory of Statistics, 1. Griffin and Co.

Kendall, M. G. (1946). The Advanced Theory of Statistics, 2. Griffin and Co.

Madow, W. G. & Madow, L. H. (1944). On the theory of systematic sampling. Ann. Math. Statist. 15, 1-24.

Neyman, J. (1934). On the two different aspects of the representative method. J. Roy. Statist. Soc. 97, 558-625.

Sukhatme, P. V. (1936). Contribution to the theory of the representative method. Suppl. J. Roy. Statist. Soc. 2, 253-68.

Wilks, S. S. (1943). Mathematical Statistics. Princeton University Press.





SOME THEOREMS ON TIME SERIES. I

By P. A. P. MORAN

Institute of Statistics, Oxford University

One of the principal problems in the theory of time series is to discuss the relation between
two series, and in the present paper we prove a theorem by which we can test whether two
such series are independent. Such a test of significance must depend on the models which
we assume for the probability processes which generate the series. In practice, the two most
useful models are, first, that of a moving average of a series of independent random components
and, secondly, the solutions of linear stochastic difference equations.

Let

$$\dots,\ \eta(t-1),\ \eta(t),\ \eta(t+1),\ \dots$$

be a sequence of independent random variables each distributed in the same distribution
which we take to have zero mean and its second, third, and fourth moments finite. Then the
time series generated by

$$X(t) = \sum_{i=0}^{n}\alpha_i\,\eta(t-i)$$

is a moving average with weights $\alpha_i$. On the other hand, consider a stochastic difference
equation of the form

$$X(t) + a_1X(t-1) + \dots + a_hX(t-h) = \eta(t). \tag{1}$$

In order that the solution of (1) for successive values of $t$ shall form a stationary series it
is necessary to impose the condition that the roots of the characteristic equation

$$z^h + a_1z^{h-1} + \dots + a_h = 0 \tag{2}$$

shall all lie inside the circle $|z| = 1$ (Wold, 1938, p. 63). When this is true the solution of (1)
can be shown to be of the form

$$X(t) = \sum_{i=0}^{\infty}\alpha_i\,\eta(t-i),$$

where the $\alpha_i$ are certain functions of the roots of (2). In this case $\sum_{i=0}^{\infty}|\alpha_i|$ is majorized by a
convergent geometric series.

Thus we see that both the above models are included in the more general one in which
we define $X(t)$ as given by

$$X(t) = \sum_{i=0}^{\infty}\alpha_i\,\eta(t-i),$$

where the $\alpha_i$ are any sequence of constants satisfying $\sum_{i=0}^{\infty}|\alpha_i| < \infty$. Now suppose

$$\dots,\ \zeta(t-1),\ \zeta(t),\ \zeta(t+1),\ \dots$$

is another sequence of independent random variables having a distribution with zero mean
and finite second, third and fourth moments. We write

$$Y(t) = \sum_{i=0}^{\infty}\beta_i\,\zeta(t-i),$$

where $\sum_{i=0}^{\infty}|\beta_i| < \infty$. To discuss whether two such empirical series of this form are correlated
we prove that the covariance

$$\sum_{t=1}^{n} X(t)\,Y(t) \tag{3}$$




tends, as $n$ increases, to be distributed in the normal form about zero mean with a second
moment which is a function of the $\alpha_i$ and the $\beta_i$. We shall discuss later the calculation of
this second moment from empirical series, in which case some care is necessary.

We first illustrate our method of proof by considering the much simpler problem of determining
the asymptotic distribution of the sum

$$T_n = \sum_{s=1}^{n} X(t-s). \tag{4}$$

We shall show that this asymptotic distribution is also, under certain conditions, normal.
This result is interesting because it establishes a central limit theorem (and therefore a law
of large numbers) for stationary stochastic processes of this type. The law of large numbers
for Markov chains has been considered by several writers, in particular Bernstein (1927),
who proves his results by using central limit theorems for non-independent components.
His theorems cannot be applied in the present case, but some of the ideas of his methods can.
Consider (4) above, where $X(t)$ is defined by

$$X(t) = \sum_{i=0}^{\infty}\alpha_i\,\eta(t-i)$$

and $\sum_{i=0}^{\infty}|\alpha_i|$ is convergent. There is no loss in generality in supposing that

$$\sum_{i=0}^{\infty}|\alpha_i| < 1.$$

Clearly

$$E(T_n) = \sum_{s=1}^{n}\sum_{i=0}^{\infty}\alpha_i\,E[\eta(t-s-i)] = 0.$$

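Write $c_0 = E[X(t)^2]$, $c_s = E[X(t)X(t-s)]$, and let $\sigma^2$ denote the second moment of the $\eta$'s.
Then

$$c_0 = \sigma^2(\alpha_0^2 + \alpha_1^2 + \dots), \qquad c_s = \sigma^2(\alpha_0\alpha_s + \alpha_1\alpha_{s+1} + \dots),$$

which are both clearly convergent. Moreover,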

•f=l 3*=1 

tends, as n increases, to -b 2 X 

if this series converges absolutely. We shall show that lim is finite. For Eq is clearly 

(6 

and this is finite. Moreover, we notice that is not greater than 


n tends, as n increases, to 


We must now impose the condition that $\sum_{i=0}^{\infty}\alpha_i$ is not zero. This condition is necessary to our
method of argument. If it is not zero, it may be assumed, without loss of generality, greater
than a positive number. We now show that as $n$ increases

$$\Pr\{t_0[2E(T_n^2)]^{1/2} < T_n < t_1[2E(T_n^2)]^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$

We require the following lemma (Bernstein, 1927, p, 12); 
Lemma I. Let 

where p„, and cr,( are random variables such that 
Then if, for n large, 

ck 

tends, uniformly in and to 7r~^ dt, 

Jk 

then < Pn < 

where - E(pl) tends, uniformly in tg and to 

J/o 

provided that 


lim ^ = 0, 

r 


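Let $\epsilon$ be an arbitrarily small number and choose $N$ so large that

$$\sum_{i=N+1}^{\infty}|\alpha_i| < \epsilon\sum_{i=0}^{\infty}|\alpha_i| < \epsilon.$$

Write

$$X_1(t) = \sum_{i=0}^{N}\alpha_i\,\eta(t-i), \qquad T'_n = \sum_{s=1}^{n} X_1(t-s).$$

Then $E(T'_n) = 0$, and write $R'_n = E(T'^2_n)$.

We shall prove that the distribution of $T'_n$ tends to normality, i.e. that

$$\Pr\{t_0(2R'_n)^{1/2} < T'_n < t_1(2R'_n)^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$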

We first calculate and in another way. Por 
T 

9=1 

and so Rn = E{TD = 0 -^ (a§ + + . . . + (ag + . . . + + £ (a^ ^* . . . + j , 

{ 9=1 ) 


: X X{t-s)= S S 0Ci7}{t-s-i] 

B=1 9=1 i«0 




and this series oonvergea. On the other hand, 

+■ S (0to+ -- 

17-1 


and so 


+ ((Xi +... + Otj^} - 1 )+... -H - iV^}, 

= o'^ao + ■ ■ ■ (^^0 + + - . + (e^o + . . . + 

+ {?! ^ N') ((Xfl + 4 . . + <x>{^Y 4 (^1 + ♦ • • 4" oiiv)^ + • * * + 




Since we have already supposed that $\sum_{i=0}^{\infty}\alpha_i$ is positive, there exist positive numbers $N_0$ and
$d$ such that for all $N > N_0$, $\sum_{i=0}^{N}\alpha_i > d$. If this is not true the theorem is in general false. For
suppose the distribution of the $\eta$'s to be non-normal and write $\alpha_0 = -\alpha_1 = 1$, $\alpha_i = 0$
$(i > 1)$. Then the distribution of $T_n$ does not tend to normality and its variance does not
increase with $n$. We shall later show that this condition on the $\alpha$'s is in fact satisfied for the
solutions of stochastic difference equations.

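Now by the ordinary central limit theorem, as $n$ increases,

$$T''_n = \sum_{s=1}^{n-N}(\alpha_0 + \dots + \alpha_N)\,\eta(t-N-s)$$

tends to be distributed normally with zero mean and variance

$$R''_n = (n - N)\,\sigma^2(\alpha_0 + \dots + \alpha_N)^2,$$

that is

$$\Pr\{t_0(2R''_n)^{1/2} < T''_n < t_1(2R''_n)^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$

Using Lemma I we see that the same is true, for fixed $N$, when we replace $T''_n$ by $T'_n$ and $R''_n$
by $R'_n$. Now

$$T_n = T'_n + Q,$$

say, where $Q$ is what we get if we replace the sequence $(\alpha_0, \alpha_1, \dots)$ in $T_n$ by $(\alpha_{N+1}, \alpha_{N+2}, \dots)$ and
alter $t$, and from (5) we can choose $N$ so large that for $n > N$, $n^{-1}E(Q^2) < \epsilon$, say. Taking a
sequence $\epsilon_1, \epsilon_2, \dots$ tending to zero and choosing first $N$ sufficiently large and then $n$ and
using Lemma I again, we see that

$$\Pr\{t_0(2R_n)^{1/2} < T_n < t_1(2R_n)^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$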

To complete the discussion we must show that the condition we have imposed on the
sequence $\alpha_0, \alpha_1, \dots$ is satisfied by the coefficients of the solutions of stationary stochastic
difference equations. Consider an equation

$$X(t) + a_1X(t-1) + \dots + a_hX(t-h) = \eta(t)$$

such that the roots of

$$z^h + a_1z^{h-1} + \dots + a_h = 0 \tag{6}$$

all lie inside the circle $|z| = 1$. Then the solution of this equation is given (Wold, 1938,
p. 63) by

$$X(t) = \sum_{i=0}^{\infty}\alpha_i\,\eta(t-i),$$

where the $\alpha_i$ are now the solutions of the infinite set of equations

$$\alpha_0 = 1,$$
$$a_1\alpha_0 + \alpha_1 = 0,$$
$$a_2\alpha_0 + a_1\alpha_1 + \alpha_2 = 0,$$
$$\dots$$
$$a_h\alpha_i + a_{h-1}\alpha_{i+1} + \dots + \alpha_{i+h} = 0,$$
$$\dots$$

and since the left-hand side is an absolutely convergent double series, we add, obtaining

$$(1 + a_1 + \dots + a_h)\sum_{i=0}^{\infty}\alpha_i = 1,$$

and so $\sum_{i=0}^{\infty}\alpha_i \ne 0$ and, as already observed, without loss of generality, may be supposed positive.
This quantity is finite because all the roots of equation (6) lie inside the circle $|z| = 1$.
Moreover, it follows that

$$\sigma^2\Bigl(\sum_{i=0}^{\infty}\alpha_i\Bigr)^2 = \sigma^2(1 + a_1 + \dots + a_h)^{-2}.$$

This is, in fact, proportional to the derivative at zero of the integrated power spectrum
(Wold, 1938, p. 69).
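
The recursion for the weights can be tried numerically. The sketch below assumes the equations $\alpha_0 = 1$ and $\alpha_j = -(a_1\alpha_{j-1} + \dots + a_h\alpha_{j-h})$ for $j \ge 1$ (terms with negative index omitted), which is how the infinite set of equations for the $\alpha_i$ is read here, and checks that the truncated sum of the weights is close to $1/(1 + a_1 + \dots + a_h)$. The coefficients are hypothetical, chosen so that the characteristic roots are 0.2 and 0.3; the function name is introduced here for illustration.

```python
import cmath


def moving_average_weights(a, n_terms):
    """First n_terms weights alpha_i of the solution of
    X(t) + a[0] X(t-1) + ... + a[h-1] X(t-h) = eta(t)."""
    h = len(a)
    alpha = [1.0]
    for j in range(1, n_terms):
        alpha.append(
            -sum(a[i - 1] * alpha[j - i] for i in range(1, min(j, h) + 1))
        )
    return alpha


# Hypothetical second-order equation: z^2 - 0.5 z + 0.06 = 0.
a = [-0.5, 0.06]
disc = cmath.sqrt(a[0] ** 2 - 4 * a[1])
roots = [(-a[0] + disc) / 2, (-a[0] - disc) / 2]

alpha = moving_average_weights(a, 200)
total = sum(alpha)

print("moduli of roots:", [abs(z) for z in roots])
print("sum of weights :", total)
print("1/(1+a1+a2)    :", 1 / (1 + sum(a)))
```

Because both roots lie inside the unit circle the weights decay geometrically, so a truncation at 200 terms is far more than enough for the two printed quantities to agree.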

We now turn to the problem of discussing the relation between two such series and we
consider the asymptotic distribution of

$$S_n = \sum_{t=1}^{n} X(-t)\,Y(-t),$$

where

$$X(t) = \sum_{i=1}^{\infty}\alpha_i\,\eta(t-i) \quad (\alpha_1 \ne 0), \tag{7}$$

and

$$Y(t) = \sum_{i=1}^{\infty}\beta_i\,\zeta(t-i) \quad (\beta_1 \ne 0). \tag{8}$$

We write $S_n$ in this form rather than that of (3) for the sake of convenience in what follows,
and we have altered the notation of the sums (7) and (8) so that they begin with the
coefficients $\alpha_1$ and $\beta_1$ for the same reason. Writing

$$c_s = E[X(t)X(t-s)], \qquad d_s = E[Y(t)Y(t-s)] \qquad (s = 0, 1, \dots),$$

as before, we have

$$c_s = \sigma_1^2\sum_{i=1}^{\infty}\alpha_i\alpha_{i+s}, \qquad d_s = \sigma_2^2\sum_{i=1}^{\infty}\beta_i\beta_{i+s},$$

where $\sigma_1^2$ and $\sigma_2^2$ are the second moments of $\eta$ and $\zeta$. Then

$$E(S_n^2) = E\Bigl[\sum_{t=1}^{n}\sum_{t'=1}^{n} X(-t)X(-t')\,Y(-t)Y(-t')\Bigr] = nc_0d_0 + 2\sum_{s=1}^{n-1}(n-s)\,c_sd_s. \tag{9}$$

Consider the behaviour of $n^{-1}E(S_n^2)$ as $n$ increases. Clearly

$$n^{-1}E(S_n^2) \to c_0d_0 + 2\sum_{s=1}^{\infty}c_sd_s = C, \text{ say}, \tag{10}$$

if the series $C$ is absolutely convergent. If $X$ and $Y$ are moving averages or the solutions of
stationary stochastic difference equations this is certainly true, for in the first case the series
is finite, and in the second it is majorized by a convergent geometric series. We show that it
is true in the general case by the following argument. Without restricting generality, we
may assume, as before, that $\sum_{1}^{\infty}|\alpha_i| < 1$, $\sum_{1}^{\infty}|\beta_i| < 1$. Then

<a-|(lail + ...h 


and so 




Also 


S (lAl |Al + '*0 

i)=ii 

<o-5(r|( S I />! l] . 

,<io«^5o-s( ( SJ Al”). 


and so $c_0d_0 + 2\sum_{s=1}^{\infty}c_sd_s$ is finite. We now prove that $C$ is not zero. For

0 = c„(i„+2 i; (iX = ir!<rir(«i+«i+...)(AU/?i+-)+2 S S S . 

S"! L fl«l TIl=l «=1 J 

and after some rearrangement, this equals 

^i^a[(ai + Mb + ^2^2, + ‘ • .h 

and is greater than zero and therest non-negative at least. We therefore conclude that 


where 


Assuming as before that 


we define N so that 


0<G<<x>. 

<x> 00 

S|ai|<l, 2|A,1<1. 

1 1 

eo ta 

S la^l<eSkt|<e, 

iv+i I ' ' 


We now write 


S |A|<cS|A|<e, where e is small. 

iv+i 1 

ina 1 


m - 2 


and consider the sum 


(16) 




We begin by proving that when $n$ is large this sum tends to be distributed in the normal
form with a variance which is asymptotically equal to $nC_1$, where $C_1$ is obtained from $C$ by
putting $\alpha_i = \beta_i = 0$ for $i > N$. For it is then clearly true that

Now consider = £ ^i( - 1) Y^{ ~ t), 

1 

where tt is greater than N. For convenience of notation, we write 

w a> 

Wethenliave S; = S S 

where the are certain constants. Moreover 

B{7}iQ ==« 0 all il, j, 

=cr|£r| alli,^‘, 

^iViCjQ = ^ for j + A;, 

if or k:^l. 

It therefore follows that ~ crlcl ^ 

, i«l ^^1 

Inserting (14) and (15) in (16) we have 

A^j = 0 

if i>n + N^ or or 

and 8!^ — + 

JV N 

where Si S U 

with + for i>j, 

= for i<j, 

= ■ for i=^j. 

We also have = SSA^yj^^p 

where the sum is taken over values of i and ji such that j i—j [ <W, i^n,j^n and either 
N <i or N <j, where 

for j-i=^p>0 
= a^+i/ii + . . . + cCi^/^N-p for i-j^p>0 

= 4’...+aA!^A' foi^ 

Then E{^^) = 0, Em) = crla-lEEA^, 

where the sum is taken over the above values of i and j. This equals 

{n - N) 0*1 <7l[(ai^jv)'‘ + («i Av-i + + > - • 

+ + . . . + ajv^iv)^ + ■ • ■ + (av A&+ A)^]- (I'?) 





We know that 0. Let /?, be the first term of the sequence > A 'which is'not 

aero. Such a term certainly exists, Then the sum in the outer brackets of (17) will contain 
a term of the form («! and consequently E{Zl) > 0, and for N fixed will increase as {n~N), 


Next we have 




where either 

or 

and 

Then 


i<^n}j>n and j~i<Ni 
j4n,i>7i and i—j<N, 

= + for i-3 = j )>0 

= aifip+i+-+«N-pfiN for j-i^p>0. 

ms,) = 0, ^ 


( 18 ) 

(19) 


where the aura is taken over the values (18) and (19). 

n-t-N n+N 

Finally S 4 ~ S S 

i-w+1 i=*n+l 

where ^ y - ^t-n^i-p-n + - . + i—j = p>0 

” ^j-p-nPi-^n "t • ' * "t ^jv— for J ^ > 0 

'-^p§p +“• +®^Af/^iv i-j-n+p>n, 

n+2f n+^ 

and •®(S4) = 0, if(Si) = o-fo-i a ^A%. 

i-n+l J>=n+1 

We readily see that E{SiSj)’=0 for t+j 

and therefore ~E{8[,^) = E(S\) + E{£fi + E(EI,) + E{Sl). 

I 

Moreover j for constant N, -^(^1) and ^(X'J) are constant, and so for large n we have 

-^0^ = + * * • 

4- 4 * , . , 4 - 4- • • 4 4" 'h O' (20) 

' n ■ 

Now suppose that N is fixed and consider the sum S^i( — We write 

1 

n “ m{mi-N)-{-pj 


where j)< 2m4- iV 4’ 1 and n is large enough for m to be greater than N. This equation fixes 
m which increases roughly as wi when n increasea. Write 

( N N+m m*+TnN n \ 

s + s +...1- s + s )x,(-t)yi(-t> 

(-1 jV + 1 1 f WCW-l-tH— 1) Tl“J)+l/ 

= +T^2+ .. . 4-F^4- Ci;,4- Tf . 

ThenFi, .-.,Krt ^ independent and E{Vl), i?(FJ,) are independent of ti, and in 

fact not greater than KN, where iT is a constant independent of N. Also M{W^) ia not 
greater than K{2m-\rN 4 - 1), where K may be taken as the same constant, C 4 , are 

also all independent and E(U^) ia asymptotically equal to mG^ when n (and therefore m) 
are large. Therefore, writing 

+ -B„. = T'i + ...+7„+W, 




we haT6 E{A„) = 0, E{Al) = 2 E(U% 

i-1 

E{BJ = 0, E(Bl,) = 2 ^:(Ff) + ^(F^). 

i™l 

and tlie latter increases as m, whilst the former increajses as and so 
as 71 inoreases. 

By Lemmal it is therefore sufficient to show that the distribution of tends to normality. 

Lemma II (Liapounoff’s Central Limit Theorem, Bernstein, 1927). If 

is the sum of 7n independent quantities such that 

= 0, = 6W ^ (,«. 

m 

and if, as m increases, H c^’”W0, 

r-l 

m 

where = 2 

r-;l 

then j)r{J,(26„)t<r„<y2i„)‘} 

tends, uniformly in Iq and to 7r~^ e~*“ di, 

Ji, 

To apply the lemma we put I/,. = We already have E{Uf) - 0. Also 

by (20), 

and so 

Now consider = jE/{![7*), 

<» 40 

where ^ = S S 

j) = l g-l 

and the are calculated with m in place of n. Since the all have the same probability 
distribution and similarly for the ^’s, we shall write t}^ and for 
for the sake of convenience. So we can write the above 

CO CO 

Uj. “ S j2 '^pqVp^q' 

P^l g=l 

Z7J will be a polynomial of the fourth order in the T/’a and the ^^s and its expectation may 
be regarded as the sum of two distinct types of terms so that ^ 2^£f(Wi} + £'JS(w 2 ), 

where the terms are of the form Ap^y^ and the terms are of the form 

with All other terms arising in the product will clearly vanish when the ex- 

pectation is taken. 

Then, since the are bounded and the number of non-zero terms in and are not 
greater tlian ‘lN{m -I- N) and 4N\m + iV)^ respectively, we have 

E{U^)<Km\ 

where K is a constant depending on N hut independent of m and n. It follows that - 





is of order and tends to xoro as n and m increase. The conditions of the lemma are there- 
fore satisfied and we conclude that 




tends, uniformly in and tj, to 



Applying Lemma I wo have 




lends, uniformly in and ti, to the same limit. 

We now consider tlie relationship between and Write 


K = i) - i) 

= .1 {k ,s (l/‘M 

n / 00 \ / 03 \ 

= S S S 

fl / « \ f ^ \ 

+ S s ccintJ 

i-l Vi-Af-fl / Vi-l / 

11 / JV \ f ^ \ 

+ S f S I S 

= TFi + lf2 + F3, (21) 

We must now calculate the variance of these terms. Consider again (9). We liave shown 
(11) that 

1 

<2orao-|(|a3| + |ai| + „.)®, 

and we now apply this to the three sums in equation (21). It follows that if N be chosen to 
satisfy the conditions (12) and (13) then 


\imn-^E{W\) < Kcr\(rl£'^f \\mrr^B{W^<K(T\(r\e^j < /fcrfcr^e^, 


where iC is a constant independent of N, 

It follows that $S_n = S'_n + W_1 + W_2 + W_3$,

where the variance of $W_1$, $W_2$ and $W_3$ can be made small compared with that of $S'_n$ by choosing
$N$ large. Then by first choosing $N$ large and then $n$ and using Tchebycheff's inequality, we
see that the distribution of $S_n$ tends to normality with variance $E(S_n^2)$ and this completes
the proof.

In the general application of the above results some care is needed. We can suppose that
our empirical values of $X$ and $Y$ are distributed about their sample means which we take
to be zero and we must estimate the variance of $S_n$ from formula (9), or (approximately)
from (10). But we must not insert in this formula the sample covariances for the $c_s$ and the
$d_s$ because, as Bartlett (1946) has shown, the standard errors of the sample values of these
covariances are of order $n^{-1/2}$ and we cannot therefore expect the series (10) to converge, let
alone give the correct value. To use the formula correctly we must first decide on the order
and coefficients of the stochastic difference equation which we can suppose generated the
series and, from these coefficients, calculate the value of (9).
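
Once the weights of the two series have been derived from the assumed generating equations, the limiting quantity $C = c_0d_0 + 2\sum_{s \ge 1} c_sd_s$ of (10) is a short computation. The sketch below assumes finite (or truncated) weight sequences and the covariances $c_s = \sigma_1^2\sum_i\alpha_i\alpha_{i+s}$, $d_s = \sigma_2^2\sum_i\beta_i\beta_{i+s}$; the function names and the numerical weights are hypothetical.

```python
def autocovariance(weights, noise_var, s):
    """Lag-s autocovariance of a moving average with the given weights."""
    return noise_var * sum(
        weights[i] * weights[i + s] for i in range(len(weights) - s)
    )


def limiting_variance(alpha, beta, var_eta, var_zeta):
    """C = c_0 d_0 + 2 * sum_{s>=1} c_s d_s for two independent
    moving-average series; lags beyond the shorter weight sequence
    contribute nothing because one of the covariances is zero."""
    max_lag = min(len(alpha), len(beta)) - 1
    c = [autocovariance(alpha, var_eta, s) for s in range(max_lag + 1)]
    d = [autocovariance(beta, var_zeta, s) for s in range(max_lag + 1)]
    return c[0] * d[0] + 2 * sum(c[s] * d[s] for s in range(1, max_lag + 1))


# Two purely random series: C is the product of the two variances.
C_white = limiting_variance([1.0], [1.0], 2.0, 3.0)

# Two first-order moving averages with the same (hypothetical) weights.
C_ma = limiting_variance([1.0, 0.5], [1.0, 0.5], 1.0, 1.0)

print(C_white, C_ma)
```

Serial correlation of the same sign in both series inflates $C$ above the product of the variances, which is why treating $\sum X(t)Y(t)$ as if the terms were independent would understate its sampling variance.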

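In the case where the series are generated by a three-term difference equation, the calculations
are simplified. Suppose the $X$ and $Y$ satisfy the equations

$$X(t+2) + aX(t+1) + bX(t) = \eta(t+2),$$
$$Y(t+2) + AY(t+1) + BY(t) = \zeta(t+2),$$

where $E(\eta(t)) = E(\zeta(t)) = 0$ and $E(\eta(t)^2) = \sigma_1^2$, $E(\zeta(t)^2) = \sigma_2^2$,
as before. For the series to be stationary, we must have $b < 1$, $B < 1$. We suppose that in
addition to this the series are oscillatory and so $a^2 < 4b$, $A^2 < 4B$. The solutions will then be

$$X(t) = \sum_{s=0}^{\infty} 2(4b - a^2)^{-1/2}\,p^{s+1}\sin\theta(s+1)\;\eta(t-s),$$

$$Y(t) = \sum_{s=0}^{\infty} 2(4B - A^2)^{-1/2}\,P^{s+1}\sin\Theta(s+1)\;\zeta(t-s),$$

where $p = b^{1/2}$, $P = B^{1/2}$, $\cos\theta = -a(2b^{1/2})^{-1}$, $\cos\Theta = -A(2B^{1/2})^{-1}$. Also (Kendall, 1946, p. 408)

$$\frac{c_s}{c_0} = \frac{p^s\sin(s\theta + \psi)}{\sin\psi}, \qquad \frac{d_s}{d_0} = \frac{P^s\sin(s\Theta + \Psi)}{\sin\Psi},$$

where

$$\tan\psi = \frac{1 + p^2}{1 - p^2}\tan\theta \quad \text{and} \quad \tan\Psi = \frac{1 + P^2}{1 - P^2}\tan\Theta,$$

and

$$c_0 = \sigma_1^2\,\frac{1 + b}{(1 - b)\{(1 + b)^2 - a^2\}}, \qquad d_0 = \sigma_2^2\,\frac{1 + B}{(1 - B)\{(1 + B)^2 - A^2\}}.$$

We then need to calculate

$$C = c_0d_0 + 2\sum_{s=1}^{\infty}c_sd_s = c_0d_0\Bigl\{1 + 2\sum_{s=1}^{\infty}\frac{p^sP^s\sin(s\theta + \psi)\sin(s\Theta + \Psi)}{\sin\psi\,\sin\Psi}\Bigr\}$$

$$= c_0d_0\Bigl\{1 + \frac{pP}{\sin\psi\,\sin\Psi}\Bigl[\frac{\cos(\psi - \Psi + \theta - \Theta) - pP\cos(\psi - \Psi)}{1 - 2pP\cos(\theta - \Theta) + p^2P^2} - \frac{\cos(\psi + \Psi + \theta + \Theta) - pP\cos(\psi + \Psi)}{1 - 2pP\cos(\theta + \Theta) + p^2P^2}\Bigr]\Bigr\}. \tag{22}$$

It is probably easiest to calculate $C$ from this equation rather than attempt to simplify
(22) still further. I hope to discuss the practical application of these formulae in another
paper.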

REFERENCES

Wold, H. (1938). A Study in the Analysis of Stationary Time Series. Uppsala.

Bernstein, S. (1927). Math. Ann. 97, 1.

Bartlett, M. S. (1946). Supp. J. Roy. Statist. Soc. 8, 27.

Kendall, M. G. (1946). Advanced Theory of Statistics, 2. London: Charles Griffin and Co.





RANK CORRELATION BETWEEN TWO VARIABLES, ONE OF 
WHICH IS RANKED, THE OTHER DICHOTOMOUS 

By J. W. WHITFIELD, Psychological Laboratory, University of Cambridge 

Rank correlation is one of the most useful statistical techniques available for the treatment 
of data arising in experimental and applied psychological research. Chambers (1946) has 
indicated the type of data most frequently occurring in these fields, and has pointed out the 
advantages of Kendall's τ over Spearman's ρ or any form of transformation to ordinal form. 

Given the use of τ when tied rankings are present (Kendall, 1946) it seemed possible to 
extend the method to cover a very common problem in psychology, namely, determination 
of the relation between two variables, one of which is expressed as a ranking and the other 
as a dichotomy. In applied or field work the relation of a psychological 'measurement' and 
an external criterion nearly always appears in this form. The usual method of determining 
the relationship consists of reducing the ranking to a dichotomy and calculating χ² for the 
2 × 2 table which results. That this may lead to inaccuracy can be seen from the following 
example: 

Variable A   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 

Variable B   +  +  +  +  +  +  +  -  -  -  +  +  +  -  -  -  -  -  -  - 

Variable C   -  -  -  +  +  +  +  +  +  +  -  -  -  -  -  -  -  +  +  + 

Here the data are supposed to be ranked according to variable A and dichotomized into 
+ and - with respect to variables B and C. 

Treating the relation between variables A and B as a 2 × 2 contingency table: 

                                  Variable B 
                                   +      - 
Variable A    Rankings 1-10        7      3 
              Rankings 11-20       3      7 


Applying χ², P is found to be 0.074 without Yates's correction for continuity, or 0.180 if 
the correction is applied. 

But χ² is exactly the same for the contingency table relating variables A and C, although 
it is obvious from the data that there is considerable difference in the two relationships, the 
evidence for which is sacrificed by reducing the ranking to a dichotomy. 
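The two probabilities quoted for the 2 × 2 table can be reproduced with a few lines. This is a minimal sketch, not the author's working: it assumes the usual χ² statistic for a fourfold table with one degree of freedom, for which the upper tail is `erfc(sqrt(x/2))`; the function names are illustrative.

```python
import math

def chi2_2x2(a, b, c, d, yates=False):
    """Chi-square for the fourfold table [[a, b], [c, d]], optionally with Yates's correction."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_chi2_1df(x):
    """Upper-tail probability of chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

plain = p_chi2_1df(chi2_2x2(7, 3, 3, 7))                  # about 0.074
corrected = p_chi2_1df(chi2_2x2(7, 3, 3, 7, yates=True))  # about 0.180
```

Because only the four cell counts enter, any arrangement of the signs within each half of the ranking gives the same result, which is the loss of information the text describes.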

If, alternatively, we consider the dichotomous variable as a ranking composed entirely 
of two sets of tied rankings, we may calculate the coefficients τ between A and B, A and C 
respectively, which I shall denote by $\tau_{AB}$, $\tau_{AC}$. The corresponding values of S will be found 
to be, after the manner described by Kendall (1946): 

$$S_{AB} = +70 - 9 + 21 = +82,$$

$$S_{AC} = -30 + 49 - 21 = -2.$$
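The scores +82 and −2 can be recovered by counting pairs directly. The sketch below assumes the sign sequences for B and C are the ones implied by the three-term sums above (seven +, three −, three +, seven − for B; three −, seven +, seven −, three + for C); the sequences and the function name are reconstructions for illustration, not quoted from the paper.

```python
def s_score(signs):
    """S for a dichotomy listed in rank order: +1 for each (+ before -) pair, -1 for each (- before +)."""
    s = 0
    for i in range(len(signs)):
        for j in range(i + 1, len(signs)):
            if signs[i] == '+' and signs[j] == '-':
                s += 1
            elif signs[i] == '-' and signs[j] == '+':
                s -= 1
    return s

variable_b = '+' * 7 + '-' * 3 + '+' * 3 + '-' * 7   # assumed sequence for B
variable_c = '-' * 3 + '+' * 7 + '-' * 7 + '+' * 3   # assumed sequence for C

s_ab = s_score(variable_b)   # 82
s_ac = s_score(variable_c)   # -2
```

Pairs with the same sign contribute nothing, which is how the two tied sets of the dichotomy enter the score.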

For the calculation of τ in the case of tied rankings we have a choice in the denominator 
by which S is to be divided to give τ. In the untied case this would be $\tfrac12 n(n-1)$, where n is 






the number of ranks. In the tied case we may take the denominator as $\tfrac12 n(n-1)$ or as 
$[\tfrac12 n(n-1)\{\tfrac12 n(n-1) - \sum \tfrac12 t(t-1)\}]^{1/2}$, where $t_1$, $t_2$, etc., are the extent of the ties. The choice is 
determined by practical considerations (see Kendall, 1946), but is not material to a discussion 
of significance. For an untied ranking and a dichotomy with x and y members in 
each class, the second form reduces to $\{\tfrac12 n(n-1)\,xy\}^{1/2}$. 

In the case of two untied rankings Kendall has shown that $\operatorname{var} S = \tfrac{1}{18}n(n-1)(2n+5)$. 

In the case of one untied ranking and one with ties of extent $t_1$, $t_2$, etc., Sillitto (1947) has 
extended this result by proving that 

$$\operatorname{var} S = \tfrac{1}{18}\Bigl\{n(n-1)(2n+5) - \sum t(t-1)(2t+5)\Bigr\}. \qquad (1)$$

In the case of an untied ranking and a dichotomy, $t_1 = x$, $t_2 = y$, and we have 
then the simple form 

$$\operatorname{var} S = \frac{xy(n+1)}{3}. \qquad (2)$$

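In the example above this gives 

$$\operatorname{var} S_{AB} = \frac{(10)(10)(21)}{3} = 700, \qquad \sqrt{(\operatorname{var} S)} = 26.46,$$

$$\frac{S_{AB} - 1}{\sqrt{(\operatorname{var} S)}} = \frac{82 - 1}{26.46} = 3.06.$$

The probability of a deviation greater than this in absolute value is 0.0022. Further, 

$$\frac{|S_{AC}| - 1}{\sqrt{(\operatorname{var} S)}} = \frac{2 - 1}{26.46} = 0.0378,$$

and the corresponding probability is 0.970. 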
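The arithmetic of this example is short enough to script. The sketch below applies equation (2), subtracts unity as the continuity correction (half the interval of 2 between successive S values for an untied ranking), and converts the deviate to a two-sided normal probability; the helper names are illustrative.

```python
import math

def var_s_untied_dichotomy(x, y):
    """Equation (2): variance of S for an untied ranking and a dichotomy into x and y members."""
    n = x + y
    return x * y * (n + 1) / 3

def normal_deviate(s, variance, correction=1.0):
    """(|S| - correction) / sqrt(var S)."""
    return (abs(s) - correction) / math.sqrt(variance)

def two_sided_p(z):
    """Probability of a normal deviate exceeding z in absolute value."""
    return math.erfc(z / math.sqrt(2))

variance = var_s_untied_dichotomy(10, 10)   # 700.0
z_ab = normal_deviate(82, variance)         # about 3.06
z_ac = normal_deviate(-2, variance)         # about 0.0378
p_ab, p_ac = two_sided_p(z_ab), two_sided_p(z_ac)
```

The same χ² table thus corresponds to a deviate of about 3.06 for B and about 0.04 for C.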


Variance when there are ties in the ranking 

The variance of S given by equation (2) is true only in the case of a dichotomy and an untied 
ranking. For a tied ranking I surmised from some special cases that 

$$\operatorname{var} S = \frac{xy}{3n(n-1)}\Bigl\{n^3 - n - \sum (t^3 - t)\Bigr\}. \qquad (3)$$

In the note following this paper Mr Kendall provides proof of this result. 

Example (from data collected by the Medical Research Council team in Germany 1946, 
as yet unpublished). Selected workers in a factory were interviewed and an assessment 
made of their adaptation to living conditions. They were assessed as 'Efficient' or 
'Overactive'. Other data were available, including statements by the men of frequency of nocturia. 
For men aged 50-59 years the following was observed: 

Assessment     Rank order of frequency of nocturia 
               (least frequent nocturia given highest rank) 

Efficient      2½, 2½, 2½, 2½, 6½, 6½, 10, 10, 10, 10, 14, 14 

Overactive     5, 10, 14, 16, 17 

Five is the highest ranking in the overactive group. Four members of the efficient group 
have higher rankings, and eight lower rankings. The S score for that member is therefore 
4 − 8. Similarly, for all members we have 

$$S = 4 - 8 + 6 - 2 + 10 + 12 + 12 = +34.$$








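Using a denominator in the form 

$$\bigl[xy\{\tfrac12 n(n-1) - \sum \tfrac12 t(t-1)\}\bigr]^{1/2},$$

τ is given by 

$$\tau = \frac{34}{\sqrt{[(12)(5)\{\tfrac12(17)(16) - \tfrac12(4)(3) - \tfrac12(2)(1) - \tfrac12(5)(4) - \tfrac12(3)(2)\}]}} = \frac{34}{83.43} = 0.41.$$

From (3) we then have 

$$\operatorname{var} S = \frac{(12)(5)}{3(17)(16)}\{17^3 - 17 - (4^3 - 4) - (2^3 - 2) - (5^3 - 5) - (3^3 - 3)\} = 344.6.$$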
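The worked example can be retraced from the rank values. The sketch below assumes the data read as twelve 'Efficient' ranks (2½ four times, 6½ twice, 10 four times, 14 twice) and five 'Overactive' ranks (5, 10, 14, 16, 17), which is the reading consistent with S = 34 and with the tie extents 4, 2, 5, 3 in the denominator; the variance follows equation (3) and the helper names are illustrative.

```python
import math
from collections import Counter
from fractions import Fraction

efficient = [2.5] * 4 + [6.5] * 2 + [10] * 4 + [14] * 2   # assumed reading of the data
overactive = [5, 10, 14, 16, 17]

def s_score(first, second):
    """For each member of `second`: members of `first` ranked higher minus those ranked lower."""
    return sum((a < b) - (a > b) for b in second for a in first)

def tie_extents(ranks):
    """Number of members sharing each rank value (untied values give extent 1)."""
    return list(Counter(ranks).values())

def var_s_tied_dichotomy(x, y, ties):
    """Equation (3): dichotomy into x and y members against a ranking with the given ties."""
    n = x + y
    return Fraction(x * y, 3 * n * (n - 1)) * (n ** 3 - n - sum(t ** 3 - t for t in ties))

def tau_with_ties(s, x, y, ties):
    """S divided by [xy{n(n-1)/2 - sum t(t-1)/2}]^(1/2)."""
    n = x + y
    pairs = Fraction(n * (n - 1), 2) - sum(Fraction(t * (t - 1), 2) for t in ties)
    return s / math.sqrt(x * y * pairs)

ties = tie_extents(efficient + overactive)
s = s_score(efficient, overactive)                                    # 34
variance = var_s_tied_dichotomy(len(efficient), len(overactive), ties)
tau = tau_with_ties(s, len(efficient), len(overactive), ties)
```

Exact fractions are used for the variance so that the value 11715/34 (about 344.6) is free of rounding.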


A small problem arises when we consider the correction for continuity to be applied in 
testing the significance of an observed value of S. In the case of a dichotomy and an untied 
ranking the interval between successive S values is 2. In the case of a dichotomy and a 
ranking composed entirely of ties of the same extent t the interval is 2t. But in the example 
the ties are of varying extent, and the interval between successive S values is composed of 
a mixture of the intervals produced by the successive rank values. Thus, although these 
varying intervals are combined so that over most of the range the interval between successive 
values is unity, the distribution oscillates somewhat, and to use the value ½ as the correction 
for continuity would sometimes be misleading. Further work is required to determine the 
correction which will provide a probability on the normal distribution equal to or slightly 
greater than the true probability in all cases. Until this is available I propose to use a crude 
correction, based on the average of the intervals mentioned above. In the example the 
successive rank values 2½ and 5 give an interval of 5 in S score, rank values 5 and 6½ give an 
interval of 3, and it is therefore possible to determine the average interval by calculating 
the intervals given by successive rank values. This calculation can be shortened. The total 
of the S score intervals is twice the number of members, less the extent of the ties involving 
the first and last members. If we divide this by the number of intervals between successive 
rank values we have the average S score interval. In the example this is $\tfrac16(34 - 4 - 1)$. Using 
half of this as the correction for continuity we have 

$$\frac{S - 2.42}{\sqrt{(\operatorname{var} S)}} = \frac{34 - 2.42}{18.56} = 1.702.$$


The pre-observational hypothesis, made on psychological grounds, was that excessive 
nocturia is a symptom of inefficient adaptation to living conditions, i.e. a positive correlation 
should be obtained. From these observations the probability of a positive correlation as 
great or greater than the observed value appearing by chance is 0.044. Direct calculation 
of the positive tail of the distribution of S gives a probability of 0.0368. 
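The crude continuity correction and the resulting one-tailed probability can be sketched as follows. The code assumes, as the quoted intervals suggest, that the S-score interval between two successive rank values equals the sum of the extents of the ties at those two values; the rank list is the reading of the example data used for the variance 11715/34, and all names are illustrative.

```python
import math
from collections import Counter

# All 17 rank values of the example (assumed reading of the data).
ranks = [2.5] * 4 + [5] + [6.5] * 2 + [10] * 5 + [14] * 3 + [16, 17]

extent = Counter(ranks)
values = sorted(extent)

# Interval in S score between successive rank values: sum of the two tie extents.
intervals = [extent[a] + extent[b] for a, b in zip(values, values[1:])]
average_interval = sum(intervals) / len(intervals)
correction = average_interval / 2            # about 2.42

variance = 11715 / 34                        # var S from equation (3), about 344.6
z = (34 - correction) / math.sqrt(variance)  # about 1.70
p_one_tail = 0.5 * math.erfc(z / math.sqrt(2))
```

The total of the intervals, 29, agrees with the shortcut in the text: twice the number of members less the tie extents at the two ends, 34 − 4 − 1.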

The alternative testing hypothesis based on the absolute value of S gives a probability 
twice as great, and the corresponding direct calculation using both positive and negative 
tails of the actual S distribution gives a probability of 0.0736. 

By itself this evidence could only be debatable substantiation of the psychological 
hypothesis. In fact, additional data from two other factory groups, treated in the same 
way, gave a total S value of +104, the square-root of the total variance being 36.00, providing 
a justification of the hypothesis. 





The case of the 2 × 2 table 

If one dichotomous variable can be considered as a ranking with two sets of tied ranks it is 
logical to consider the case when both variables are in this form. If we have a 2 × 2 table 
in the form 

        (AB)    (Ab)    (A) 
        (aB)    (ab)    (a) 
        (B)     (b)      N 

any member of (AB) taken with any member of (ab) has the same order in either ranking 
and hence contributes +1 to S, and any member of (Ab) with any member of (aB) 
contributes −1. The others contribute nothing. Hence 

$$S = (AB)(ab) - (Ab)(aB).$$

From equation (3) 

$$\operatorname{var} S = \frac{(A)(a)}{3N(N-1)}\bigl[N^3 - N - \{(B)^3 - (B)\} - \{(b)^3 - (b)\}\bigr] = \frac{(A)(a)(B)(b)}{N-1}. \qquad (4)$$

Again, for testing the significance of an observed value of S it is necessary to correct for 
continuity by subtracting half the interval between successive S values. In the case of the 
2 × 2 table the interval is N, for if we increase (AB) by unity S becomes 

$$\{(AB) + 1\}\{(ab) + 1\} - \{(Ab) - 1\}\{(aB) - 1\} = (AB)(ab) - (Ab)(aB) + N.$$

Hence, for the normal deviate, we have 

$$\frac{|S| - \tfrac12 N}{\sqrt{\{(A)(a)(B)(b)/(N-1)\}}}. \qquad (5)$$

It will be noted that τ (taking the ties into account in calculating the denominator) is 

$$\frac{(AB)(ab) - (Ab)(aB)}{\sqrt{\{(A)(a)(B)(b)\}}},$$

which is the product-moment correlation for a 2 × 2 table when the variables are 
conventionally regarded as possessing the discrete values 0, 1. 

Testing by use of the normal deviate seems to be moderately accurate, and would appear 
to be useful in those cases where χ² is suspect because of small expectations in the cells of 
the 2 × 2 table. It is less laborious to calculate than the hypergeometric treatment, and is 
an alternative form of the approximation to hypergeometric treatment given by Pearson 
(1947), who also discusses the order of accuracy of the approximation. 

Using the data given earlier as an example, but assuming that it had been possible only 
to grade nocturia into 'Normal' or 'Excessive', we have the following table: 



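                      Nocturia 
Assessment       Normal    Excessive 

Efficient          10          2 

Overactive          2          3 


$$S = 30 - 4 = 26, \qquad \operatorname{var} S = \frac{(12)(5)(12)(5)}{16} = 225.$$

This gives, after correction for continuity, 

$$\frac{S - \tfrac12 N}{\sqrt{(\operatorname{var} S)}} = \frac{26 - 8.5}{15} = 1.1667.$$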

This gives the probability of S being attained or exceeded in the direction of the hypothesis 
(i.e. positive values only) as 0.1217. χ² without the continuity correction gives P = 0.0369,* 
and with the correction, P = 0.1143. The hypergeometric treatment, summing the 
probabilities of obtaining 3, 4 or 5 in the Overactive-Excessive category, gives P = 0.1165. 

If the more customary test of absolute value is applied, χ² with Yates's correction gives 
P = 0.2286, S and the normal deviate gives P = 0.2434, i.e. both values of P are doubled. 
The hypergeometric treatment, adding the probability of obtaining 0 in the 
Overactive-Excessive category, gives P = 0.2445. 

It will be seen that in conditions such as these, S and the normal deviate give a reasonable 
approximation to the exact treatment. 
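The comparison between the normal deviate and the exact treatment can be reproduced as follows. This is a sketch under the cell layout Efficient = (10, 2), Overactive = (2, 3); the exact probabilities are hypergeometric terms with the margins held fixed, and the variable names are illustrative.

```python
import math
from fractions import Fraction
from math import comb

AB, Ab, aB, ab = 10, 2, 2, 3          # Efficient/Normal, Efficient/Excessive, Overactive/Normal, Overactive/Excessive
A, a, B, b = AB + Ab, aB + ab, AB + aB, Ab + ab
N = A + a

S = AB * ab - Ab * aB                              # 26
var_S = Fraction(A * a * B * b, N - 1)             # 225
z = (abs(S) - N / 2) / math.sqrt(var_S)            # about 1.1667
p_normal = 0.5 * math.erfc(z / math.sqrt(2))       # about 0.1217

def hypergeometric(k):
    """Probability of k in the Overactive-Excessive cell with all margins fixed."""
    return Fraction(comb(a, k) * comb(N - a, b - k), comb(N, b))

p_exact_one_tail = sum(hypergeometric(k) for k in (3, 4, 5))   # 721/6188
p_exact_both = p_exact_one_tail + hypergeometric(0)            # 1513/6188
```

With these margins the exact one-tailed value is about 0.1165 against 0.1217 from the normal deviate, which is the closeness the text remarks on.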

* This is making the common assumption that $\{(AB)(ab) - (Ab)(aB)\}^2 N/\{(A)(a)(B)(b)\}$ is 
distributed as χ² with 1 degree of freedom, or that its square root is a normal deviate with sign depending 
on the sign of $(AB)(ab) - (Ab)(aB)$. 


REFERENCES 

Chambers, E. G. (1946). Statistical techniques in applied psychology. Biometrika, 33, 269. 

Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30, 81. 

Kendall, M. G. (1946). The treatment of ties in ranking problems. Biometrika, 33, 239. 

Pearson, E. S. (1947). The choice of statistical tests illustrated on the interpretation of data classed 
in a 2 × 2 table. Biometrika, 34, 139. 

Sillitto, G. P. (1947). The distribution of Kendall's coefficient of rank correlation in rankings 
containing ties. Biometrika, 34, 36. 






THE VARIANCE OF τ WHEN BOTH RANKINGS CONTAIN TIES 


By M. G. KENDALL 


1. The variance of τ in the population of sample permutations was given in my paper of 
1938 for the case where no tied ranks exist. Mr Sillitto (1947) has given the formula where 
one ranking contains ties but the other does not. In the foregoing paper Mr Whitfield has 
correctly surmised the variance when one ranking contains ties, and the other is a dichotomy. 
In this note I derive the general formula for the variance when both rankings contain ties. 
The results of Messrs Sillitto and Whitfield then follow as special cases. 

2. I shall follow the method of Daniels (1944). If $a_{ij}$ represents the contribution of the 
ith and jth members of a ranking to τ we have 

$$a_{ij} = \begin{cases} +1 & (i < j) \\ \phantom{+}0 & (i = j) \\ -1 & (i > j). \end{cases} \qquad (1)$$

We write 

$$c_{ij} = a_{ij} b_{ij}, \qquad (2)$$

where a and b refer to different rankings, and 

$$c = \sum_{i,j=1}^{n} c_{ij}. \qquad (3)$$

The quantity c is simply related to S by the relation 

$$c = 2S, \qquad (4)$$

and for the testing of τ it is sufficient to test c or S, which are merely constant multiples of τ. 
I work with the quantity c. 


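3. We have, from Daniels's results, 

$$\cdots \qquad (5)$$

$$\sum_{i,j=1}^{n} a_{ij}^2 = n(n-1), \qquad (6)$$

$$\sum_{i,j,l=1}^{n} a_{ij} a_{il} = \tfrac13 n(n^2-1), \qquad (7)$$

$$E(c) = 0, \qquad (8)$$

$$E(c^2) = \frac{4}{n(n-1)(n-2)}\Bigl\{\sum a_{ij}a_{il} - \sum a_{ij}^2\Bigr\}\Bigl\{\sum b_{ij}b_{il} - \sum b_{ij}^2\Bigr\} + \frac{2}{n(n-1)}\sum a_{ij}^2 \sum b_{ij}^2. \qquad (9)$$

If we substitute from (6) and (7) in (9) we find 

$$E(c^2) = \tfrac29 n(n-1)(2n+5), \qquad (10)$$

or, equivalently, 

$$E(S^2) = \tfrac{1}{18} n(n-1)(2n+5), \qquad (11)$$

from which the variance of τ in the case of untied rankings follows at once. 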

4. Now suppose that sets of $t_1, t_2, \ldots$ consecutive members in one ranking are tied. In 
place of (6) we then have 

$$\sum_{i,j=1}^{n} a_{ij}^2 = n(n-1) - \sum t(t-1), \qquad (12)$$




the summation on the right taking place over the various values of t. This result follows 
simply from the consideration that for a pair of tied ranks $a_{ij} = 0$, and consequently the sum 
of squares of contributions from a tied set is of the same form as for the ranking as a whole. 

In place of (7) we have 

$$\sum_{i,j,l=1}^{n} a_{ij} a_{il} = \tfrac13 n(n^2-1) - \tfrac13 \sum t(t^2-1). \qquad (13)$$

This is not quite so obvious. Consider a set of tied ranks. The contribution to the sum on 
the left of (13) will be unchanged if the suffixes j, l fall outside this set. If they both fall 
inside, no contribution arises and therefore we have to subtract the term $\tfrac13 t(t^2-1)$. The 
remaining possibility is that one falls inside and one outside. In such a case the contribution 
remains unchanged in total for it is zero in the original untied case, each possible pair 
occurring once to give +1 and once to give −1. Formula (13) follows. 


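5. By substitution in (9) we then have, for two rankings with ties typified respectively 
by t and u, 

$$E(c^2) = \frac{4}{n(n-1)(n-2)}\Bigl\{\tfrac13 n(n-1)(n-2) - \tfrac13\sum t(t-1)(t-2)\Bigr\}\Bigl\{\tfrac13 n(n-1)(n-2) - \tfrac13\sum u(u-1)(u-2)\Bigr\}$$

$$\qquad + \frac{2}{n(n-1)}\Bigl\{n(n-1) - \sum t(t-1)\Bigr\}\Bigl\{n(n-1) - \sum u(u-1)\Bigr\}. \qquad (14)$$

This is the general formula required. We can express it in the alternative form 

$$E(c^2) = \tfrac29 n(n-1)(2n+5) - \tfrac29\sum t(t-1)(2t+5) - \tfrac29\sum u(u-1)(2u+5)$$

$$\qquad + \frac{4}{9n(n-1)(n-2)}\Bigl\{\sum t(t-1)(t-2)\Bigr\}\Bigl\{\sum u(u-1)(u-2)\Bigr\} + \frac{2}{n(n-1)}\Bigl\{\sum t(t-1)\Bigr\}\Bigl\{\sum u(u-1)\Bigr\}. \qquad (15)$$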


6. (i) If one ranking is untied, say all the u's are zero, we have Mr Sillitto's result 

$$E(c^2) = \tfrac29 n(n-1)(2n+5) - \tfrac29\sum t(t-1)(2t+5). \qquad (16)$$

(ii) If one ranking is untied and the other is a dichotomy into x and n − x = y members, 
(16) reduces to 

$$E(c^2) = \tfrac43 xy(n+1), \qquad (17)$$

agreeing with Mr Whitfield's equation (2). 

(iii) If one ranking contains ties and the other is a dichotomy we find on substitution 
in (14) 

$$E(c^2) = \frac{4xy}{3n(n-1)}\Bigl\{n^3 - n - \sum (t^3 - t)\Bigr\}, \qquad (18)$$

agreeing with Mr Whitfield's equation (3). 

(iv) Finally, if both variates are dichotomized into x, y and u, v we find 

$$E(c^2) = \frac{4xyuv}{n-1}, \qquad (19)$$

agreeing with Mr Whitfield's equation (4). 
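The four special cases can be checked against the general formula mechanically. The sketch below codes var S = E(c²)/4 in the expanded form, with the ties of the two rankings passed as lists of extents; exact fractions avoid rounding. The function name and argument layout are illustrative, and n must be at least 3.

```python
from fractions import Fraction

def var_s(n, t=(), u=()):
    """Variance of S (= E(c^2)/4) for n members with tie extents t in one ranking and u in the other."""
    def pairs(ties):
        return sum(k * (k - 1) for k in ties)

    def triples(ties):
        return sum(k * (k - 1) * (k - 2) for k in ties)

    def weighted(ties):
        return sum(k * (k - 1) * (2 * k + 5) for k in ties)

    value = Fraction(n * (n - 1) * (2 * n + 5) - weighted(t) - weighted(u), 18)
    value += Fraction(triples(t) * triples(u), 9 * n * (n - 1) * (n - 2))
    value += Fraction(pairs(t) * pairs(u), 2 * n * (n - 1))
    return value

untied = var_s(10)                                       # n(n-1)(2n+5)/18
untied_dichotomy = var_s(20, u=(10, 10))                 # xy(n+1)/3
tied_dichotomy = var_s(17, t=(4, 2, 5, 3), u=(12, 5))    # Mr Whitfield's equation (3)
double_dichotomy = var_s(17, t=(12, 5), u=(12, 5))       # xyuv/(n-1)
```

A dichotomy is simply a ranking with two ties of extents x and y, so cases (ii) to (iv) need no separate code.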


REFERENCES 

See the references to Mr Whitfield's paper together with: 

Daniels, H. E. (1944). The relation between measures of correlation in the universe of sample 
permutations. Biometrika, 33, 129. 



A 'smooth' test for goodness of fit 

into the ith group, and let $m_i$ (i = 1, 2, ..., k) be the expected number. It is possible 
theoretically for χ² to be calculated for the case where 

$$\sum_{i=1}^{k} n_i \ne \sum_{i=1}^{k} m_i,$$

but such cases must be rare in statistical practice. We shall overlook this case and will 
consider the case where the totals of observed and expected are made equal to one another 
with the resultant loss of one degree of freedom in the calculation of χ². If the totals agree then 

$$\sum_{i=1}^{k} n_i - \sum_{i=1}^{k} m_i = \sum_{i=1}^{k} \delta_i = 0,$$

where $\delta_i = n_i - m_i$. In order that the sum of these δ's should be zero, at least one of them 
must be negative in sign, but which one of these δ's it will be would seem to be a matter of 
chance. It is on this fact that we shall base the first test criterion. 

4. Suppose that we have a sequence of r signs of which $r_1$ are positive and $r_2$ negative, 
where $r_1 + r_2 = r$, and $r_1 > 0$ and $r_2 > 0$. These signs are postulated to occur in a random order. 
Given such a sequence it is easy to record the number of sets of positive and negative signs. 
For example, if the sequence is 

then r = 15, $r_1 = 9$, $r_2 = 6$, and there are four sets of positive signs and four sets of negative 
signs. In general there can be (α) t positive, t negative, or (β) t positive, t + 1 negative, or 
(γ) t + 1 positive and t negative sets of signs. If T = 2t or 2t + 1 as required, we may ask what 
is the probability that given $r_1$ and $r_2$ such a number T of sets (alternately positive and 
negative) would have arisen through chance. This probability follows at once from 
Whitworth, Choice and Chance, Proposition XXV, viz.: 'The number of ways in which n indifferent 
things can be distributed in t different parcels (blank lots being inadmissible) is 
$(n-1)!/\{(t-1)!\,(n-t)!\}$.' 

5. The total number of ways in which $r_1$ and $r_2$ elements can be arranged is 

$$\frac{r!}{r_1!\,r_2!}.$$

We now require to enumerate the number of ways in which $r_1$ can be arranged to form t sets 
and $r_2$ to form t sets. To arrange $r_1$ in t sets is equivalent (vide Whitworth) to making t − 1 
breaks in a sequence of $r_1$ observations, and this may be done in 

$$\frac{(r_1-1)!}{(t-1)!\,(r_1-t)!} \text{ ways,}$$

and similarly for $r_2$. It is not specified whether + or − should start the sequence, and hence 
the total number of ways in which a sequence may be arranged in t sets each is 

$$\frac{2(r_1-1)!\,(r_2-1)!}{(t-1)!\,(t-1)!\,(r_1-t)!\,(r_2-t)!}.$$

Since I first thought of this method of attack I have found that the distribution of groups as 
given by me in § 5 has already been given by W. L. Stevens, Ann. Eugen., Lond., 9, 10, and by A. Wald 
and J. Wolfowitz, Ann. Math. Statist. 11, 147. The probability function has been tabled by F. S. Swed 
and C. Eisenhart, Ann. Math. Statist. 14, 66, but it is not in a form that I found suitable for my 
purposes. The probability function has been known for many years; what is interesting is the different 
uses to which it has been put. 



F. N. David 




The probability of 2t sets will be 

$$P\{2t \mid r_1, r_2\} = \frac{2\,r_1!\,(r_1-1)!\,r_2!\,(r_2-1)!}{r!\,\{(t-1)!\}^2\,(r_1-t)!\,(r_2-t)!}, \qquad (1)$$

and the probability of obtaining (2t + 1) sets will be 

$$P\{2t+1 \mid r_1, r_2\} = P\{t, t+1 \mid r_1, r_2\} + P\{t+1, t \mid r_1, r_2\} = \frac{r - 2t}{2t}\,P\{2t \mid r_1, r_2\}. \qquad (2)$$

Hence given T from a random sequence of $r_1$ positive and $r_2$ negative signs the 
probability of such a number of sets having arisen through chance may be calculated. 
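Relations (1) and (2) translate directly into binomial coefficients. The sketch below counts the arrangements giving T sets and divides by the total number of arrangements; the function names are illustrative, and `math.comb` is used so that impossible values of T come out as zero.

```python
from fractions import Fraction
from math import comb

def arrangements(T, r1, r2):
    """Number of orderings of r1 plus and r2 minus signs that form exactly T sets."""
    if T < 2:
        return 0
    if T % 2 == 0:
        t = T // 2
        # t sets of each sign; the factor 2 covers either sign starting the sequence
        return 2 * comb(r1 - 1, t - 1) * comb(r2 - 1, t - 1)
    t = (T - 1) // 2
    # t + 1 sets of one sign and t of the other
    return comb(r1 - 1, t) * comb(r2 - 1, t - 1) + comb(r1 - 1, t - 1) * comb(r2 - 1, t)

def prob(T, r1, r2):
    """P{T | r1, r2}: arrangements with T sets over all r!/(r1! r2!) arrangements."""
    return Fraction(arrangements(T, r1, r2), comb(r1 + r2, r1))

def prob_at_most(T0, r1, r2):
    """Lower tail, the sum of P{T | r1, r2} for T = 2, ..., T0."""
    return sum(prob(T, r1, r2) for T in range(2, T0 + 1))
```

For example, `prob_at_most(2, 5, 5)` is 2/252, the chance that five positive and five negative signs in random order fall into a single set of each.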

6. It is desired to use the probability of a given arrangement of signs in order to test a 
given hypothesis represented by a smooth probability law, bearing in mind that, if the given 
hypothesis is not true, then any alternative law is likely to be of a smooth type. Although 
no exact definition of a smooth alternative distribution has been made, it may be stated 
here that smooth, in the sense used by Neyman, will imply that the number of sets of signs 
will be small. For example, if the hypothesis tested is that observations follow a given normal 
curve, whereas in fact they have been drawn from a normal distribution identical with the 
first but with a smaller mean, then the differences between observation and expectation on 
the basis of the hypothesis tested may be expected to give a preponderance of positive 
signs below the sample mean and of negative signs above it; that is to say, if the difference 
in means is sufficient to offset the sampling fluctuations we should find a single set of positive 
signs followed by a single set of negative signs. If the true population is a normal curve with 
the same mean but with a larger standard deviation than that specified by the hypothesis 
tested, then there will be a tendency towards a set of positive signs, a set of negative signs, 
followed by a set of positive signs, although sampling fluctuations may not leave such a 
clear-cut answer. The more complex the alternative hypothesis the less chance there will be 
of detecting it. 

7. With this objective in view it is proposed to take the number of sets of signs, T, as 
the test criterion, rejecting the hypothesis tested whenever, for a given $r_1$ and $r_2$, T is 
exceptionally small. This we do on the grounds that the existence of very few sets of signs suggests 
that the differences between observed and expected frequencies are not due to chance 
sampling fluctuations but to some systematic departure of the true probability law (assumed 
smooth) from hypothesis. In following this procedure we should reject the hypothesis if 

$$\sum_{T=2}^{T_0} P\{T \mid r_1, r_2\} \le \epsilon,$$

where $T_0$ is the observed value of T and ε the significance level selected as appropriate. 
Exact probabilities are given in Table 1, and the application of the test is immediate.* 
There seems to be no reason why the test should not be applicable to both grouped and 
ungrouped observations, although the formulation of the hypothesis tested may be 
somewhat different in the two cases. Consider a sample which has been supposedly randomly 


* An assumption implicit in the test would appear to be that for each χ² cell there is an equal chance 
of obtaining a positive or a negative deviation, that is, that there are sufficient numbers in each cell 
for the binomial to be closely approximated to by a normal curve. An extensive series of random 
sampling experiments has shown, however, that the divergence between theory and practice is not 
significant even when the probability of obtaining a positive is four times that of obtaining a negative. 
Hence while strictly the expectation in each cell of χ² should be 10 or over, it would seem that for 
practical purposes the T test may be applied in all cases where the application of the χ² test is 
permissible. 



Table 1. Probability of obtaining a given number of sets, T. [T = 2t or 2t + 1] 


The function tabled is 

$$\frac{2(r_1-1)!\,(r_2-1)!}{(t-1)!\,(t-1)!\,(r_1-t)!\,(r_2-t)!} \quad\text{or}\quad \frac{(r_1-1)!\,(r_2-1)!\,(r_1+r_2-2t)}{t!\,(t-1)!\,(r_1-t)!\,(r_2-t)!},$$

according as T is even or odd. 

P{T} is obtained by dividing this function by the binomial term $r!/(r_1!\,r_2!)$. 










drawn from some population. Let the elements of the sample in order of drawing be 
$x_1, x_2, \ldots, x_n$. We may use the T criterion to test the hypothesis of randomness, in the 
following way. If $x_{\min}$ is the smallest value observed in the sample and $x_{\max}$ the largest, then if we 
exclude the trivial case when all the x are equal it is easy to show that $x_{\min} < \bar{x} < x_{\max}$, where 
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. If now we consider the deviations 

$$x_i - \bar{x} = \delta x_i \quad\text{for } i = 1, 2, \ldots, n,$$

there will be a series $\delta x_1, \delta x_2, \ldots, \delta x_n$, some of which quantities will be positive and some 
negative. The application of the T test is immediate, the admissible alternate hypotheses 
being that if the drawing of the sample is not at random then bias of the smooth kind is 
present. 

8. As an illustration consider the following two cases: 

Case I. 
Expected frequency    10   25   35   75  155  155   75   35   25   10 
Observation           12   29   45   81  160  146   69   31   20    8 
Deviation              +    +    +    +    +    -    -    -    -    - 

Case II. 
Expected frequency    10   25   35   75  155  155   75   35   25   10 
Observation           12   23   46   60  161  160   69   36   20    8 
Deviation              +    -    +    -    +    +    -    +    -    - 

In the first case χ² = 6.94 and in the second χ² = 6.80; in neither case would the hypothesis 
be rejected as inadequate by using the χ² criterion. The T criterion does, however, bring 
out the essential difference: 

Case I,  $r_1 = 5$, $r_2 = 5$, $T_0 = 2$ and $P\{T \le T_0\} = 0.008$. 

Case II, $r_1 = 5$, $r_2 = 5$, $T_0 = 8$ and $P\{T \le T_0\} = 0.960$. 

Using the T test we should be inclined to reject the first hypothesis in favour of a smooth 
alternative, while for the second case we should be inclined to agree with the conclusion 
drawn from the χ² that the observational material is adequately described. 
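Counting the sets in a row of signs is a one-line grouping operation. The sketch below takes the two sign patterns of the illustration as strings, an assumed reading of the deviation rows (five + followed by five − in Case I; + − + − + + − + − − in Case II), and returns $r_1$, $r_2$ and T; the function name is illustrative.

```python
from itertools import groupby

def sets_of_signs(signs):
    """Return (r1, r2, T): numbers of + and - signs and the number of sets (runs of like signs)."""
    r1 = sum(1 for s in signs if s == '+')
    r2 = sum(1 for s in signs if s == '-')
    T = sum(1 for _ in groupby(signs))
    return r1, r2, T

case_1 = sets_of_signs('+++++-----')   # assumed sign row for Case I
case_2 = sets_of_signs('+-+-++-+--')   # assumed sign row for Case II
```

With $r_1 = r_2 = 5$ there are 252 equally likely orderings, of which only 2 give T = 2, so the first pattern is exceptional while the second is not.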

9. Sampling material is available whereby the theoretical distribution of T may be tested 
in practice. Neyman & Pearson (1928) took 208 samples, each of size 200, from a population 
of eight groups described by the cubic curve 

The expectation in each cell for a sample of this size was calculated and the χ² criterion found 
for each of the 208 samples. The writer was given access to these calculations and was able 
to find the sampling distribution of T from the material. The results of this sampling 
experiment and the theoretical distribution of T from relations (1) and (2) are given in Table 2. 

The agreement between theory and practice would seem to be reasonably good, and in 
the cases (4, 4) and (5, 3) the values of χ², calculated to test the discrepancy between theory 
and practice, were not greater than might be attributable to sampling fluctuations. It was 
not thought worth while to calculate χ² for (6, 2) and (7, 1). A second sampling experiment 
in which samples of size 300 were drawn from a normal population of fifteen groups lent 
further support to the reasonableness of the theoretical distribution. 
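The 'Theory' rows of such a comparison are the sample total multiplied by the probabilities from relations (1) and (2). The sketch below is an illustration rather than the author's computation: it takes the number of samples observed with a given split of signs as input, and the totals 93, 102 and 9 used in the example calls are the values read from the damaged table, so they should be treated as assumptions.

```python
from math import comb

def arrangements(T, r1, r2):
    """Orderings of r1 plus and r2 minus signs forming exactly T sets."""
    t, odd = divmod(T, 2)
    if odd:
        return comb(r1 - 1, t) * comb(r2 - 1, t - 1) + comb(r1 - 1, t - 1) * comb(r2 - 1, t)
    return 2 * comb(r1 - 1, t - 1) * comb(r2 - 1, t - 1)

def expected_counts(total, r1, r2):
    """Expected number of samples for each T from 2 up to the largest possible value."""
    largest = 2 * min(r1, r2) + (1 if r1 != r2 else 0)
    ways = comb(r1 + r2, r1)
    return [round(total * arrangements(T, r1, r2) / ways, 1) for T in range(2, largest + 1)]

theory_44 = expected_counts(93, 4, 4)
theory_53 = expected_counts(102, 5, 3)
theory_62 = expected_counts(9, 6, 2)
```

Because the two signs are interchangeable, the split (5, 3) gives the same expectations as (3, 5), which is why the table pools them.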

10. The T criterion will be a useful supplementary criterion to the χ²; because it 
takes account solely of the sign of a deviation and not of its magnitude it will probably 
only be useful when used in conjunction with χ². A test of significance which could combine 
both the probability levels of T and χ² would undoubtedly be more useful, and we may 
therefore consider how this might be done. Unless the exact degree of dependence which 
exists between two variables is known it is usually only possible to obtain their joint 
distribution if they are independent. It would appear reasonable, both on theoretical grounds 
and from sampling experiments, to assume that T and χ² are independent, or, if the 
assumptions underlying both tests are not exactly fulfilled, to assume that the degree of dependence 
between them is at most small. 


Table 2. Comparison of theoretical distribution of T with 
that derived from a sampling experiment 

(4 positive, 4 negative) 

T = number of sets      2      3      4      5      6      7      8    Total 
Sampling                3      5     20     25     23      5      7     93 
Theory                2.7    8.0   23.9   23.9   23.9    8.0    2.7     93 

(5 positive, 3 negative) or (3 positive, 5 negative) 

T = number of sets      2      3      4      5      6      7    Total 
Sampling                2      8     32     30     20     10    102 
Theory                3.6   10.9   29.1   29.1   21.9    7.3    102 

(6 positive, 2 negative) or (2 positive, 6 negative) 

T = number of sets      2      3      4      5    Total 
Sampling                1      3      0      5      9 
Theory                0.6    1.9    3.2    3.2      9 

(7 positive, 1 negative) or (1 positive, 7 negative) 

T = number of sets      2      3    Total 
Sampling                0      2      2 
Theory                0.5    1.5      2 


11. We shall begin by demonstrating that as far as mathematics are concerned the T 
and χ² criteria are completely independent.* For simplicity of argument let us consider the 
case of three groups only. The sample may then be represented by a point $(n_1, n_2, n_3)$ in 
three-dimensioned space, with axes of reference $On_1$, $On_2$, $On_3$, and the expected population 
values by a point $(m_1, m_2, m_3)$ in the same space. Since 

$$n_1 + n_2 + n_3 = m_1 + m_2 + m_3 = N,$$

* This method of approach was suggested to me by Andrew Gleason of Harvard University at a 
seminar given at the Statistical Laboratory, University of California, at Berkeley. 













these points are constrained to lie in a plane. Fig. 1 shows this plane for the particular case 
N = 16; $m_1 = m_2 = 6$, $m_3 = 4$. Since no frequency can be negative, possible sample points 
must be within an equilateral triangle lying in this plane, the chance of occurrence associated 
with a point being the multinomial term 

$$\frac{N!}{n_1!\,n_2!\,n_3!}\left(\frac{m_1}{N}\right)^{n_1}\left(\frac{m_2}{N}\right)^{n_2}\left(\frac{m_3}{N}\right)^{n_3}.$$

When using the χ² test the mathematical approximation consists in substituting for this 
term an expression proportional to $e^{-\frac12\chi^2}$, in regarding this last as a continuous function, and 
in taking as a measure of goodness of fit the integral of this expression outside the ellipse 
which passes through the sample point and on which χ² is constant. For the case of three 
groups this integral itself assumes the simple form $e^{-\frac12\chi^2}$. Three such elliptic contours are 
shown in the diagram. 

Fig. 1. Graphical illustration of the χ² contours and the change in signs of the δn's. 
$N_1$, $N_2$ and $N_3$ denote the points of intersection of the $On_1$, $On_2$, $On_3$ axes with 
the plane $n_1 + n_2 + n_3 = N$. According to the χ² approximation, the chance equals 
α of obtaining a sample point lying outside the elliptic contour on which $\chi^2 = \chi^2_\alpha$. 

Planes through $(m_1, m_2, m_3)$ parallel to the co-ordinate planes $On_2n_3$, $On_3n_1$, $On_1n_2$ will 
intersect the sample plane 
















in three straight lines. As shown in the diagram, these lines divide the sample plane into six 
sectors, and for all sample points within a sector the signs of the differences $\delta n_i = n_i - m_i$ 
will remain unchanged. Any test based solely on runs of signs will consist in taking one or 
more of these sectors as critical regions and rejecting the hypothesis tested when the sample 
point falls therein. It is clear that if we use the mathematical approximation, the distribution 
of χ² is the same within each sector; similarly, that the chance of a sample showing a given 
combination of signs is the same on each ellipse along which χ² is constant. Thus under the 
assumptions made regarding the distribution of χ², the T and χ² criteria are completely 
independent. 

In this case of three groups T can only assume values of two or three and the former value 
would not be judged significant, but the argument will follow exactly similar lines in the 
case of many groups. The number of sectors will be in general $2(2r_2 + 1)$ if $r_1 > r_2$ and $2(2r_2)$ 
if $r_1 = r_2$, and they will be bounded by primes passing through the population point. 

12. While the distributions of T and χ² are independent for this mathematical model 
they are unlikely to be exactly so when we go back to the true multinomial density 
distribution, because the sample space is neither continuous nor infinite. The model, in fact, 
becomes inaccurate if $m_1$, $m_2$ or $m_3$ are very small. For example, it is seen in Fig. 1 that while 
the 1 % ellipse ($\chi^2 = \chi^2_{0.01}$) lies completely within the triangular space for the sectors with 
signs + − +, − + + and − − +, it lies completely without the space for the sector + + − and 
partly without for the other sectors. It has been thought worth while therefore to test 
whether the two criteria are independent in practice, and to this end the same material 
previously described has been utilized. Tables 3 and 4 give the distribution of mean χ² 
for different values of P{T} and the distribution of mean P{T} for grouped values of χ². 
There is little evidence in these figures to show that P{T} and χ² (and therefore P{χ²}) are 
related. The figures therefore lend support to the geometrical argument and indicate that 
the approximations involved in χ², both from the small sample and the fact that the sample 
space is not infinite, do not invalidate the mathematical result. 

13. In order to combine the χ² and T tests of significance it will be necessary to develop 
a theory for the combination of two tests of significance when one criterion is a continuous 
and the other a discontinuous variable. R. A. Fisher has set out the test for the combination 
of tests of significance from a number of independent continuous variables. The keystone 
of the test is the recognition of the fact that if Z is a continuous variable, then z, where 

$$z = \int_{-\infty}^{Z} p(Z)\,dZ,$$

is also a continuous variable equally likely to have any value between 0 and 1; we shall 
describe z as being distributed rectangularly. Minus twice the natural logarithm of the product of two such 
z's, say $z_1$ and $z_2$, where $z_1$ and $z_2$ follow from two independent tests of significance, can be 
shown to be distributed as χ² with four degrees of freedom. Consider a discontinuous variable 
X which may take values $X_1, X_2, \ldots, X_k$ and which has an elementary probability law 

$$P\{X = X_i\} = p_i, \quad\text{where } 0 \le p_i \le 1 \text{ and } \sum_{i=1}^{k} p_i = 1.$$

If a new variable, x, is defined as taking values $x_1, x_2, \ldots, x_k$, where 

$$x_h = \sum_{j=1}^{h} p_j,$$


F. N. David 




Table 3. Mean $\chi^2$ for different values of $P\{T\}$

r1, r2 or r2, r1                      …      4, 4   5, 3   4, 4   5, 3   6, 2   4, 4   5, 3   4, 4   6, 2
No. of obs. on which mean is based    26     6      20     28     30     26     1      32     20     3
P{T}                                  1.00   0.97   0.93   0.89   0.71   0.64   0.63   0.43   0.37   0.29
Mean chi-squared                      7.?9   6.04   6.87   6.43   6.98   …      7.41   7.32   6.82   4.86

r1, r2 or r2, r1                      7, 1   5, 3   4, 4   6, 2   5, 3 and 4, 4
No. of obs. on which mean is based    -      8      6      1      6
P{T}                                  0.25   0.14   0.11   0.07   0.04 and 0.03
Mean chi-squared                      -      ?.10   6.53   0.96   8.00


Table 4. Mean $P\{T\}$ for grouped $\chi^2$

chi-squared                           0.0-1.0   1.0-2.0   2.0-3.0   3.0-4.0   4.0-5.0   5.0-6.0   6.0-7.0   7.0-8.0   8.0-9.0   9.0-10.0
No. of obs. on which mean is based    1         7         14        27        32        29        21        14        11        1?
Mean P{T}                             0.71      0.73      0.09      0.60      0.68      0.60      0.66      0.?7      0.67      0.64

chi-squared                           10.0-11.0   11.0-12.0   12.0-13.0   13.0-14.0   14.0-15.0   15.0-16.0   16.0-17.0   17.0-18.0
No. of obs. on which mean is based    10          10          6           4           2           1           2           1
Mean P{T}                             0.74        0.66        0.74        0.58        0.82        1.00        0.63        0.63



then $x_k$ may only take values between 0 and 1 for $k = 1, 2, \ldots, N$. It is required to find the
joint probability law of the product of two independent variables $x$ and $z$, where $x$ and $z$
are as defined above. It will be noted that the elementary probability law of $x$ will be

$$P\{x = x_j\} = p_j \quad (j = 1, 2, \ldots, N).$$

Hence when $x = x_j$ (the probability of which is $p_j$) the product $xz$ will be distributed
rectangularly between 0 and $x_j$ on a proportion $p_j$ of occasions. It follows that $xz$ has a probability
distribution which has points of discontinuity at $x_1, x_2, \ldots, x_N$, that it is distributed
rectangularly between these points of discontinuity, and that

$$P\{0 < xz < x_1\} = p_1 \sum_{j=1}^{N} \frac{p_j}{x_j}, \qquad P\{x_1 < xz < x_2\} = p_2 \sum_{j=2}^{N} \frac{p_j}{x_j}, \qquad P\{x_{k-1} < xz < x_k\} = p_k \sum_{j=k}^{N} \frac{p_j}{x_j}.$$

Generally, for $x_{k-1} \le \tau \le x_k$,

$$P\{0 < xz < \tau\} = x_{k-1} + \tau \sum_{j=k}^{N} \frac{p_j}{x_j}.$$


A 'smooth' test for goodness of fit

14. If we now apply this theory to the combination of the tests of significance of $T$ and
$\chi^2$ it is seen that we must consider the product of $P\{\chi^2\}$ and $P\{T\}$. $\chi^2$ is a continuous
variable and

$$z = \int_{\chi_0^2}^{\infty} p(\chi^2)\,d\chi^2 = P\{\chi^2 > \chi_0^2\}$$

is distributed rectangularly between 0 and 1; and

$$x_k = \sum_{j=1}^{k} p_j$$

is a discontinuous variable taking known values. The probability integral of $xz$ is thus known
from theory and $\tau_{0.05}$ or $\tau_{0.01}$ can be found to satisfy the relation

$$P\{0 < xz < \tau_\epsilon\} = \epsilon.$$

These probability levels are given in Table 5. The procedure for the joint test of significance
will be:

(i) calculate $P\{T\}$ as described in § 7;

(ii) calculate $P\{\chi^2\}$ in the usual way. The degrees of freedom will be the number of groups
minus one;

(iii) multiply $P\{T\}$ and $P\{\chi^2\}$ together and refer to Table 5 to judge the significance of
the product.
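
The construction of the critical levels can be sketched numerically. What follows is a minimal illustration and not the author's own computing procedure: it assumes that $T$ is the number of sets (runs) of like signs, that $P\{T\}$ is the lower-tail probability $P\{T \le t\}$, and that the number of sets follows the usual combinatorial distribution for $r_1$ positive and $r_2$ negative signs. The function names are hypothetical.

```python
from math import comb

def sets_pmf(r1, r2):
    """P{T = t} for t sets in a random ordering of r1 plus and r2 minus signs.
    Assumes the standard run-count distribution (an assumption, see lead-in)."""
    total = comb(r1 + r2, r1)
    pmf = {}
    for t in range(2, r1 + r2 + 1):
        k, odd = divmod(t, 2)
        if odd:   # t = 2k + 1 sets: one sign has k + 1 sets, the other k
            ways = (comb(r1 - 1, k) * comb(r2 - 1, k - 1)
                    + comb(r1 - 1, k - 1) * comb(r2 - 1, k))
        else:     # t = 2k sets: k sets of each sign, either sign first
            ways = 2 * comb(r1 - 1, k - 1) * comb(r2 - 1, k - 1)
        if ways:
            pmf[t] = ways / total
    return pmf

def product_cdf(tau, pmf):
    """P{x z <= tau}, where x = P{T <= t} is discrete and z is rectangular on (0, 1)."""
    cumulative, prob = 0.0, 0.0
    for t in sorted(pmf):
        cumulative += pmf[t]                       # x_j = P{T <= t_j}
        prob += pmf[t] * min(1.0, tau / cumulative)
    return prob

def critical_level(eps, pmf):
    """Solve P{x z <= tau} = eps for tau by bisection (the cdf is increasing)."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if product_cdf(mid, pmf) < eps:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

pmf_5_5 = sets_pmf(5, 5)
tau_05 = critical_level(0.05, pmf_5_5)
tau_01 = critical_level(0.01, pmf_5_5)
print(round(tau_05, 4), round(tau_01, 4))
```

Under these assumptions, five positive and five negative signs give levels of about 0.0143 and 0.0025.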


Table 5. Values of $\tau_{0.05}$ and $\tau_{0.01}$, where $P\{P\{\chi^2\}\,P\{T\} < \tau_\epsilon\} = \epsilon$

This table may be used to judge the significance of the joint distribution
of the T criterion and any other continuous criterion.


■ 

HI 

B 


^001 

■ 

D 

m 


^ 0.01 

6 

4 

1 

0 - 0312 * 

0 - 0062 ® 

11 

10 

1 

0 * 0276 + 

0 - 0066 + 


3 

2 

0-0213 

0-0043 


9 

2 

0-0171 

0-0034 







8 


0-0144 

0*0028 

6 

6 

1 


0-0060 




0-0144 

0 - 0026 + 


4 

2 

H 

0-0042 


6 


0-0140 

0-0024 


3 

3 


0-0039 











12 

11 


0-0273 

0 - 0066 - 

1 

0 


0-0292 

0-0068 


10 

2 

0-0174 

0 * 0036 - 


6 


0-0197 

0 - 0030 ® 


9 

3 

0-0149 

0-0027 




0-0174 

0-0036 


8 

4 

0-0142 

0*0024 







■■ 

6 

0-0136 

0*0022 

8 


1 

0-0286 

0-0067 


Wm 


0-0131 

0*0021 



2 

0-0188 

0-0038 


■1 



' 



3 

0*0180 

0-0032 

J 3 

mm 


0'£>371 

0 - 0 P 64 , 



4 

0-0163 

0-0031 


mm 


0 * 0166 + 

0-0033 







h9 


0-0161 

0-0026 

fi 

8 

1 

0-0281 

0*0066 


9 


0-0138 

0-0023 


mm 

2 

0-0180 

0-0036 


8 


0-0137 

0*0022 




0-0163 

0-0031 


mm 

0 

0-0138 

0 - 0022 ® 




0-0140 

0-0028 







■I 




14 

■9 


0-0269 

0-0064 

10 

n 


0-0278 

0-0066 


mm 

2 

0-0163 

0-0033 


8 

2 

0*0176 

0-0036 


BIH 

3 

0-0161 

0 - 0026 + 


7 

3 

0-0143 

0-0029 


msM 

4 

0 - 0136 “ 

0 - 0022 ® 


6 

4 

0-0143 

0-0026 


9 

6 

0-0138 

0-0023 


6 

6 

0-0143 

0-0025 


8 

6 

0-0130 

0-0022 







m 

M 

0-0134 

0-0022 





















15. The application of the joint test of significance may be illustrated by means of an
example. A sample of 360 observations is available. This sample has actually been randomly
drawn from a normal population of which the mean is zero and the standard deviation unity.
The figures are given in Table 6. Calculations give $\chi^2 = 21.1$ and $P\{\chi^2\} = 0.10$. Judging
by the $\chi^2$ alone we should say probably that there is nothing out of the ordinary in the
deviations of the sample from the expected values. The number of signs is 15, of which 9
are positive and 6 negative, and these are arranged in six sets. Making the appropriate
calculations, we have

$$P\{6 \text{ sets} \mid 9 \text{ positive}; 6 \text{ negative}\} = 0.175.$$

The arrangement of signs will therefore be judged as acceptable. The joint significance of a
$P\{\chi^2\} = 0.10$ and a $P\{T\} = 0.175$ is found, by evaluating the joint distribution, to be 0.066.
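
The figures in this example can be checked with exact arithmetic. The sketch below is illustrative only and rests on stated assumptions: that the quoted $P\{T\}$ is the probability of six sets or fewer, and that the number of sets follows the usual combinatorial distribution. The helper name `sets_distribution` is hypothetical.

```python
from fractions import Fraction
from math import comb

def sets_distribution(r1, r2):
    """Exact P{T = t} as (t, probability) pairs for r1 plus and r2 minus signs."""
    total = comb(r1 + r2, r1)
    dist = []
    for t in range(2, r1 + r2 + 1):
        k, odd = divmod(t, 2)
        if odd:
            ways = (comb(r1 - 1, k) * comb(r2 - 1, k - 1)
                    + comb(r1 - 1, k - 1) * comb(r2 - 1, k))
        else:
            ways = 2 * comb(r1 - 1, k - 1) * comb(r2 - 1, k - 1)
        dist.append((t, Fraction(ways, total)))
    return dist

dist = sets_distribution(9, 6)

# Lower-tail probability of six sets or fewer, given 9 positive and 6 negative signs.
p_T = sum(p for t, p in dist if t <= 6)

# Joint significance: P{x z <= product}, x discrete (values P{T <= t}), z rectangular.
p_chi2 = Fraction(1, 10)
product = p_chi2 * p_T
joint = Fraction(0)
cumulative = Fraction(0)
for t, p in dist:
    cumulative += p
    if p:
        joint += p * min(Fraction(1), product / cumulative)

print(float(p_T), float(joint))
```

Both results agree, to the accuracy quoted, with the values 0.175 and 0.066 given in the text.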


Table 6. Sample values. Observed and expected

Central values   -2.1 and under   -1.8   -1.5    -1.2   -0.9   -0.6   -0.3   0.0
Observation      12               10     18      26     23     42     …      49
Expectation      9.3              8.6    14.0+   …      28.7   3?.0   …      43.0-
Deviation        +2.7             +1.4   +4.0    +6.0   -6.7   +6.1   +2.0   +6.0

Central values   +0.3   +0.6   +0.9   +1.2   +1.5   +1.8   +2.1 and over   Total
Observation      36     …      20     20     …      3      6               …
Expectation      …      …      28.7   …      …      8.6    9.3             360
Deviation        -6.0   -7.0   -8.7   +6.0   +6.0   -6.6   -4.3            0



16. A study of the basic table (Table 1) of the function $T$ will show that $P\{T\}$ is not a
very sensitive criterion with which to judge the randomness of a sequence of signs unless
the number of groups under consideration is very large. For example, if there are 10 signs,
5 of which are positive and 5 of which are negative, the probability of getting two sets of
signs is 0.008. Thus the test would show, and rightly, that the chance of such an arrangement
is small, but this fact would undoubtedly be recognized by a skilled computer without the
use of a test at all. In the case of 10 signs the probability of three groups or less is 0.040,
and this would possibly be judged non-significant. Again, let us consider an extreme case, say
10 signs, 9 of which are positive and 1 negative. The $T$ criterion does not concern itself with
the fact that the numbers 9 and 1 are exceptional; it is merely concerned with deciding whether
their arrangement is exceptional given the 9 and 1. Table 1 shows that neither possible
arrangement would be considered out of the ordinary. It is these points of weakness which
show that the criterion $T$ is not of great utility except in combination with $\chi^2$. For, if we
consider the 9 positive, 1 negative case, common sense tells us that the $\chi^2$ criterion in such
a case would possibly be significant. Nine positive deviations have to be balanced by a
single negative deviation, and this last is therefore likely to be big. This does not influence $T$;
neither will the contribution of $T$ to the joint criterion be of much weight. This is as it should
be, for it is difficult to see how one can postulate a smooth alternative for 9 positive, 1 negative,
two sets, and not also for 9 positive, 1 negative, three sets. Generally, however, we
shall not meet such extreme cases in practice. One way of overcoming this weakness of the
test would be to consider the probability of obtaining $r_1$ positive and $r_2$ negative signs together
with the probability of obtaining $T$ sets of alternate positive and negative signs given $r_1$ and $r_2$.
This is simple enough when considering just a sequence of alternatives, as I have shown
elsewhere, but it is not easy to fit these results to the $\chi^2$ problem, nor, when this is possible,
will the choice of a critical region be straightforward. However, the results of sampling
experiments will be utilized to throw light on these points and it is hoped to discuss them,
with other questions arising, in a further publication.
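
The small probabilities quoted in this paragraph can be confirmed by direct enumeration of sign arrangements. This is an illustrative check only, assuming every ordering of the signs is equally likely; the function names are hypothetical.

```python
from itertools import combinations

def count_sets(signs):
    """Number of sets (maximal blocks of like signs) in a sequence."""
    return 1 + sum(a != b for a, b in zip(signs, signs[1:]))

def prob_at_most(n_pos, n_neg, max_sets):
    """P{T <= max_sets}, enumerating every placement of the negative signs."""
    n = n_pos + n_neg
    hits = total = 0
    for neg_positions in combinations(range(n), n_neg):
        signs = ['+'] * n
        for i in neg_positions:
            signs[i] = '-'
        total += 1
        hits += count_sets(signs) <= max_sets
    return hits / total

print(prob_at_most(5, 5, 2))   # two sets, 5 positive and 5 negative
print(prob_at_most(5, 5, 3))   # three sets or fewer
print(prob_at_most(9, 1, 2))   # the extreme case: the single negative sign at an end
```

With 9 positive signs and 1 negative only two or three sets are possible, and two sets occur whenever the negative sign falls at either end of the sequence, so neither outcome is improbable.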

17. It is possible that there are other criteria, depending on the arrangement of positive
and negative signs, which will be more sensitive than the $T$ criterion chosen. For example,
it is easy to calculate, given $r_1$ and $r_2$, the probability that the largest set is composed of a
sequence of $r'$ positive signs, and there are many other possibilities which might be considered.
It would appear that any criterion based on sign sequences can be shown to be
independent of $\chi^2$ by means of geometrical argument, and it will be necessary therefore to
consider the power of these different sign tests when referred to a specified set of alternate
hypotheses.

18. The main objection to the two criteria, $T$ and $P\{\chi^2\}\,P\{T\}$, that I have proposed in
this note is the one which was mentioned earlier; they are only applicable to the case where
there is just one restriction on $\chi^2$, i.e. when the totals of expected and observed frequencies
have been made to agree. It is possible to work out a slightly different form of the $T$ criterion
for each additional restriction which is put on $\chi^2$, and this has been done. It is preferable,
however, to delay publication until the results of an extensive sampling experiment are
complete in order to verify whether such theoretical assumptions as have been made are
reasonable.


REFERENCES 

Neyman, J. (1937). Skand. Aktuar. Tidskr. 20, 149-99.

Neyman, J. & Pearson, E. S. (1928). Biometrika, 20 A, 263-94.





AN EXACT TEST FOR THE EQUALITY OF VARIANCES*

By R. L. PLACKETT, M.A.

Introduction

The problem of testing the equality of variances and covariances in normal distributions
is one which has received considerable attention; we have compiled a bibliography of some
sixty papers, and shall issue a survey of these in due course; only papers vital to our discussion
will be considered here. A precise instance of the type of situation we are considering is as
follows: measurements of height, span and tibia length are made on each of 20 Englishmen,
20 Scotsmen, 20 Welshmen and 20 Irishmen; it is required to know if the covariance matrix
of the three characteristics is the same for each of the four nationalities. Nothing is known
or assumed about the mean values of these characteristics in the four populations considered,
nor are we interested in testing any hypothesis concerning the means, although
such a hypothesis may be the object of further investigations which assume that the four
covariance matrices are the same; this latter assumption is inevitably made in multivariate
analysis of variance.

Wilks (1932) has already given the moments of the distribution of his criterion for testing 
the equality of several covariance matrices (on the hypothesis that the matrices are in fact 
equal) and Bishop (1939) put this criterion into an approximate workable shape. The test 
criterion given here differs from that of Wilks and has the advantage when one or two cor- 
related characteristics are being measured (height or height and span, for example) that its 
distribution is exactly known whatever the number of populations. Nair (1939) did, it is 
true, give the exact distribution of the Neyman & Pearson (1931) criterion for one mea- 
sured characteristic; and the exact distribution for two characteristics of Wilks’s generaliza- 
tion of their criterion; but the form in which the distribution was obtained is very involved. 
It is interesting to notice that from our standpoint the problem of testing the equality of
several variances (i.e. the case of one measured characteristic) is, as will appear, brought
within the framework of multiple correlation theory. In the general case of more than two
characteristics the moments of the distribution of our criterion, like those of Wilks, are
available.

Outline of method

In the usual terminology we consider $k$ $p$-variate normal distributions and are concerned
with testing the hypothesis that the corresponding variances and covariances are all equal.
The method we employ to test this hypothesis is essentially that which has been in use in
analysis of variance since its origination by Fisher; to test the equality of a set of $k$ quantities
we test whether $(k - 1)$ orthogonal linear functions of the quantities are each zero. To illustrate
the application of this principle in the present instance take the particular case $p = 1$,
i.e. we wish to test the equality of the variances in $k$ univariate normal distributions. If
a typical observation from the $l$th distribution is $x^l$ ($l = 1, 2, \ldots, k$), form $k$ mutually
orthogonal linear functions of the $x^l$ such that one is

$$u = \frac{1}{\sqrt{k}}\,(x^1 + x^2 + \ldots + x^k).$$

* Communication from the National Physical Laboratory.


Biometrika 34 



If the $(k - 1)$ covariances of $u$ and each of the other linear functions are all zero then the
variances of the $k$ distributions must all be equal; this condition may be expressed by saying
that the multiple correlation coefficient of $u$ on the other linear functions is zero. Further,
if there are $n$ sample values of $u$ then the size of sample drawn from each distribution must
also be $n$ at least, and if no observations are to be discarded the size of each sample must be
$n$ exactly. Thus, although it is not a condition of the problem that the sizes of samples drawn
from the $k$ distributions must all be equal, it is a condition of our solution.

The extension of the foregoing principle to $p > 1$ is straightforward and is considered in
detail in the next section; the problem then becomes that of testing the independence of two
groups of variates, the first of size $p$, i.e. $p$ expressions of the form $u$, and the second of size
$p(k - 1)$ comprising all the other orthogonal linear functions. This problem has been treated
by Wilks (1935, 1943) and the relevant distribution is expressible as an incomplete $\beta$-function
when $p = 1$ and 2 (for any $k$); the exact distribution is also known when $p = 3$ and 4 for $k = 2$.
Finally, since when $p = 1$ the criterion has the form of a multiple correlation coefficient, the
power of the test in this instance can be calculated by virtue of the work of Fisher (1928).


Discussion of the test 

A sample of $n$ observations is drawn from each of the $k$ $p$-variate normal distributions of
which the $l$th has the covariance matrix $V^l_{ij}$ ($l = 1, 2, \ldots, k$; $i, j = 1, 2, \ldots, p$). It is required
to test the hypothesis that

$$V^l_{ij} = V^m_{ij} \quad (l, m = 1, 2, \ldots, k). \tag{1}$$

The population means do not enter into the hypothesis and have arbitrary unknown values.
Where $i, j, l, m$ appear henceforth they will be understood to range over the values given
above unless otherwise stated. The observations may be written in the form of an $n \times kp$
matrix $X$ such that all those on the $i$th variate in the $l$th distribution are in column $(i-1)k + l$.
The $\alpha$th observation in this column ($\alpha = 1, 2, \ldots, n$) is denoted by $x^l_{i\alpha}$; the order of the elements
in a column is assumed to be random. If this is doubted the observations should be randomly
rearranged.

We must emphasize here that the sample value of the criterion to be used to test (1)
depends on this order, and there is thus, in a sense, a correspondence between $x^l_{i\alpha}$ and $x^m_{j\alpha}$,
although these two quantities are, of course, uncorrelated when $l \ne m$. Most tests of a
hypothesis specifying nothing about the order in which observations are made or written
down are themselves independent of it; ours is not, and different computers with the same
data might well come to different conclusions, although this does not affect the validity of
the test, the significance level being overall what it should be. There is probably some loss
of power which can, however, be offset by imbuing $\alpha$ with a certain physical meaning; but
we shall not discuss this question here. A criterion for testing normality depending on the
order of arrangement of observations has been suggested by R. C. Geary (1935, pp. 316-17).

Let

$$z^l_{i\alpha} = x^l_{i\alpha} - \bar{x}^l_i, \tag{2}$$

where $\bar{x}^l_i$ is the sample mean of column $(i-1)k + l$, and let the corresponding $n \times kp$ matrix be $Z$. If $G = Z'Z$, where a prime is used to denote
the transpose of a matrix, then, apart from a factor $n$, $G$ is the matrix of sample variances
and covariances of all variables. We further define $S(k,p)$ as the sum of all $k^{2p}$ signed minors
formed by rows $l_1, l_2, \ldots, l_p$ and columns $m_1, m_2, \ldots, m_p$ of $G$, where

… (3)

$\bar{S}(k,p)$ is similarly defined for the matrix $\bar{G} = G^{-1}$ (we shall use this notation for the inverses
of matrices throughout).

We now proceed to prove the following

Theorem:

$$W(k,p) = \frac{k^{2p}}{S(k,p)\,\bar{S}(k,p)}$$

is distributed like Wilks's statistic for testing the hypothesis that two groups of variates
of sizes $p$ and $p(k-1)$, known to have been drawn from a $(kp)$-variate normal distribution,
are mutually independent (Wilks, 1935, 1943). If the groups are in fact mutually independent
then (1) is true.

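Proof. Introduce a $k \times k$ orthogonal matrix $B$, the elements of whose first column are all
equal (to $\pm 1/\sqrt{k}$) but which is otherwise quite arbitrary. Put

$$r = (i-1)k + l, \quad u = (m-1)p + j, \tag{4}$$

and form a $kp \times kp$ matrix $A$ such that

$$a_{ru} = b_{lm}\,\delta_{ij}, \tag{5}$$

where $\delta_{ij} = 1$ ($i = j$), otherwise 0. Clearly $A$ is also orthogonal. For example, suppose
$k = 4$, $p = 2$. Apart from a factor of $\pm\tfrac{1}{2}$ multiplying each element, let

$$B = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}.$$

Then

$$A = \begin{pmatrix}
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & -1 & 0 & -1 & 0 \\
1 & 0 & -1 & 0 & 1 & 0 & -1 & 0 \\
1 & 0 & -1 & 0 & -1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 1 & 0 & -1 & 0 & -1 \\
0 & 1 & 0 & -1 & 0 & 1 & 0 & -1 \\
0 & 1 & 0 & -1 & 0 & -1 & 0 & 1
\end{pmatrix}.$$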

When $p = 1$, $A = B$. Let

$$D = XA, \quad Y = ZA, \quad C = Y'Y = A'GA. \tag{6}$$

Putting

$$s = (j-1)k + m, \quad t = (l-1)p + i, \tag{7}$$

and defining

$$t' = (l-1)p + i, \quad u' = (m-1)p + j \quad (l, m = 2, 3, \ldots, k), \tag{8}$$

we have

$$\mathscr{E}(g_{rs}) = (n-1)\,\delta_{lm}\,V^l_{ij}, \tag{9}$$

so that

$$\mathscr{E}(c_{iu'}) = (n-1)\sum_{l} b_{lm} V^l_{ij} \big/ \sqrt{k}. \tag{10}$$

Hence when

$$\mathscr{E}(c_{iu'}) = 0 \tag{11}$$

equations (1) are satisfied, because for fixed $i$ and $j$ equations (10) can be solved and yield

$$(n-1)V^l_{ij} = \mathscr{E}(c_{ij}). \tag{12}$$

Denote a typical element of the $r$th column of $D$ by $d_r$. Then equations (11) are satisfied if
and only if $d_i$ and $d_{u'}$ are mutually independent.

A criterion for testing (11), obtained by likelihood-ratio methods, has been given by Wilks
(1935, 1943). This is

$$W(k,p) = \frac{|c_{tu}|}{|c_{ij}|\,|c_{t'u'}|} \tag{13}$$

and is sometimes called the vector alienation coefficient. Let $C^{(p)}$ be the $p$th compound of $C$
(Aitken, 1939, p. 90), i.e. the matrix of all $p \times p$ minors of $C$; and $\bar{C}^{(p)}$ the $p$th compound of
$\bar{C} = C^{-1}$ (since $\bar{C}^{(p)}$ is the inverse of $C^{(p)}$ our notation is consistent). Then

$$W(k,p) = 1 \big/ c^{(p)}_{11}\,\bar{c}^{(p)}_{11} \tag{14}$$

by an application of Jacobi's theorem on the minors of the adjugate (Aitken, 1939, p. 97).
Now by the Binet-Cauchy theorem (Aitken, 1939, p. 93),

$$C^{(p)} = (A')^{(p)}\,G^{(p)}\,A^{(p)}. \tag{15}$$


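Consider the elements in the first row of $(A')^{(p)}$. The first $p$ rows of $A'$ are of the form

$$\begin{matrix}
1\,1 \ldots 1 & 0\,0 \ldots 0 & 0\,0 \ldots 0 & \ldots & 0\,0 \ldots 0 \\
0\,0 \ldots 0 & 1\,1 \ldots 1 & 0\,0 \ldots 0 & \ldots & 0\,0 \ldots 0 \\
0\,0 \ldots 0 & 0\,0 \ldots 0 & 1\,1 \ldots 1 & \ldots & 0\,0 \ldots 0 \\
\vdots & & & & \vdots \\
0\,0 \ldots 0 & 0\,0 \ldots 0 & 0\,0 \ldots 0 & \ldots & 1\,1 \ldots 1
\end{matrix}$$

apart from the factor $\pm 1/\sqrt{k}$ multiplying each element. Therefore the only non-zero elements
in the first row of $(A')^{(p)}$ are those formed by taking one column from each of the $p$ blocks
of $k$ columns into which the first $p$ rows of $A'$ may be divided. All the non-zero elements
equal $k^{-p/2}$. Then from (15)

$$c^{(p)}_{11} = S(k,p)\,k^{-p}, \quad \bar{c}^{(p)}_{11} = \bar{S}(k,p)\,k^{-p}, \tag{16}$$

so finally

$$W(k,p) = \frac{k^{2p}}{S(k,p)\,\bar{S}(k,p)}. \tag{17}$$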

This completes the proof. 

Case of $p = 1$

Here $W(k, 1) = k^2 / S(k, 1)\,\bar{S}(k, 1)$,

where $S(k,1)$, $\bar{S}(k,1)$ are the sums of all elements of $G$, $\bar{G}$ respectively. If (1) is true,
$\mathcal{W}(k, 1)$, the true value of $W(k, 1)$, is unity. Define

$$W(k,1) = 1 - R^2 \quad\text{and}\quad \mathcal{W}(k,1) = 1 - \mathcal{R}^2, \tag{18}$$

so that if (1) is true, $\mathcal{R} = 0$. The distribution of $R^2 = 1 - W(k, 1)$ when $\mathcal{R} = 0$ is, as Wilks
pointed out, well known, being that of the multiple correlation coefficient (of $d_1$ on $d_2, d_3, \ldots, d_k$);
if in the usual notation

$$I_x(a, b) = [B(a, b)]^{-1} \int_0^x x^{a-1}(1-x)^{b-1}\,dx, \tag{19}$$

then the cumulative distribution function of $x = R^2$ is $I_x\{\tfrac{1}{2}(k-1), \tfrac{1}{2}(n-k)\}$, values near 1 being
significant; that of $x = W(k, 1)$ being $I_x\{\tfrac{1}{2}(n-k), \tfrac{1}{2}(k-1)\}$ with small values significant. Tables
in convenient form have been calculated by Thompson (1941); otherwise we can convert
to the variance-ratio $F$ by

$$F = (n-k)(1-W) \big/ (k-1)\,W. \tag{20}$$
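
The case $p = 1$ lends itself to a short numerical check that the closed form (17) and the multiple-correlation route give the same value. The sketch below is only an illustration on simulated data: the sample sizes, random seed and scale factors are arbitrary assumptions, NumPy is assumed to be available, and the matrix `B` is the $k = 4$ example used in the proof.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed (assumption)
n, k = 12, 4                              # n must exceed k
X = rng.normal(size=(n, k)) * np.array([1.0, 3.0, 1.0, 4.0])

# Deviations from column means, and G = Z'Z.
Z = X - X.mean(axis=0)
G = Z.T @ Z

# Closed form for p = 1: W = k^2 / (sum of elements of G * sum of elements of G^{-1}).
W_sum = k**2 / (G.sum() * np.linalg.inv(G).sum())

# Orthogonal route: first column of B has all elements equal to 1/2.
B = np.array([[1, 1, 1, 1],
              [1, 1, -1, -1],
              [1, -1, 1, -1],
              [1, -1, -1, 1]]) / 2.0
Y = Z @ B
u, others = Y[:, 0], Y[:, 1:]

# Squared multiple correlation of u on the other orthogonal functions.
coef, *_ = np.linalg.lstsq(others, u, rcond=None)
resid = u - others @ coef
R2 = 1.0 - (resid @ resid) / (u @ u)

# Variance-ratio form, equation (20).
F = (n - k) * (1.0 - W_sum) / ((k - 1) * W_sum)
print(W_sum, 1.0 - R2, F)
```

The first two printed values coincide, which is the identity $W(k,1) = 1 - R^2$; the third is referred to the variance-ratio distribution with $k-1$ and $n-k$ degrees of freedom.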



It is clear that $n$ must exceed $k$; for $p$ variates, $n$ exceeds $pk$ in order that $G$ may be non-singular.

If the matrix $A$ is defined instead as a $kp \times kp$ orthogonal matrix, the elements of whose
first column are all equal (cf. equation (5)), the problem is effectively reduced to the case
$p = 1$ whatever the value of $p$, and we can test exactly the somewhat indefinite hypotheses

$$V^l_{i1} + V^l_{i2} + \ldots + V^l_{ip} = V^m_{j1} + V^m_{j2} + \ldots + V^m_{jp}. \tag{21}$$

This may be applied in the following manner, for take $k = p = 2$ and obtain

$$V^1_{11} + V^1_{12} = V^1_{21} + V^1_{22} = V^2_{11} + V^2_{12} = V^2_{21} + V^2_{22}. \tag{22}$$

Thus

$$V^1_{11} = V^1_{22} \quad\text{and}\quad V^2_{11} = V^2_{22}. \tag{23}$$

If it is assumed

$$V^1_{11} = V^2_{11}, \tag{24}$$

then

$$V^1_{12} = V^2_{12}, \tag{25}$$

and conversely.

Case of $p = 2$

The distribution of $W(k, 2)$ has been given by Wilks (1935). If $x = \sqrt{[W(k, 2)]}$, the cumulative
distribution function of $x$ is

$$I_x(n-2k,\, 2k-2). \tag{26}$$

Small values of $x$ are significant and $n$ must exceed $2k$.

Case of $p > 2$

For $k = 2$, $p = 3$ and 4, the exact distributions are again known and have been given by
Wilks in equations (36) and (37) respectively of his 1935 paper. The expressions are rather
complicated and we have not reproduced them here. For other values of $k$ and $p$ the moments
of $W(k,p)$ are available; while more recently Wald & Brookner (1941) have obtained the
distribution in the form of an infinite series, calculating numerical values for the coefficients
in certain instances.

For $p > 1$, (17) becomes rather intractable as a means of calculating $W(k,p)$. Indeed,
for $k = 2$ and $p = 4$ it is necessary

(i) to calculate 36 sample variances and covariances, 

(ii) find the inverse of an 8 x 8 matrix, 

(iii) calculate 512 4 x 4 determinants, 

and it is clearly better to reintroduce the matrix $A$ in some appropriate numerical form,
calculate $Y = ZA$ and $C = Y'Y$, and find $W(2,4)$ from (13), a process which involves the
evaluation of an 8 x 8 and two 4 x 4 determinants.
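
The numerical route just described can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed procedure: NumPy is assumed available, the data are simulated with an arbitrary seed, the function names `build_A` and `wilks_W` are hypothetical, and the indices of (4) and (5) are shifted to start from zero.

```python
import numpy as np

def build_A(B, p):
    """kp x kp matrix with a_{ru} = b_{lm} delta_{ij}, r = i*k + l, u = m*p + j (0-based)."""
    k = B.shape[0]
    A = np.zeros((k * p, k * p))
    for i in range(p):
        for l in range(k):
            for m in range(k):
                A[i * k + l, m * p + i] = B[l, m]   # delta_{ij} keeps only j = i
    return A

def wilks_W(X, k, p, B):
    """W(k, p) from equation (13): |C| divided by the product of the two block determinants."""
    Z = X - X.mean(axis=0)                  # deviations from column means
    Y = Z @ build_A(B, p)
    C = Y.T @ Y
    return np.linalg.det(C) / (np.linalg.det(C[:p, :p]) * np.linalg.det(C[p:, p:]))

# Simulated example with k = 2 populations and p = 4 variates (n must exceed pk = 8).
rng = np.random.default_rng(1)              # arbitrary seed (assumption)
k, p, n = 2, 4, 20
B2 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
X = rng.normal(size=(n, k * p))             # column i*k + l holds variate i, population l
W_24 = wilks_W(X, k, p, B2)
print(W_24)
```

Only one 8 x 8 and two 4 x 4 determinants are evaluated, in line with the count given above, and the 512 minors required by (17) are never formed.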

Power of the test when $p = 1$

From (17) the true value of $W(k, 1)$ is in general given by

$$1 - \mathcal{R}^2 = k^2 \Big/ \Big[\Big(\sum_l V^l_{11}\Big)\Big(\sum_l 1/V^l_{11}\Big)\Big], \tag{27}$$

and thus the test will have equal power for all values of the variances such that the product
of their sum and the sum of their reciprocals is constant. Consequently $1 - W(k, 1)$ is distributed
like the multiple-correlation coefficient in samples from a population where the
true value is given by (27). The probability density function of this distribution has been
deduced by Fisher (1928) and can be integrated to give a finite series when $(n - k)$ is even.
We find easily when $k = 2$ that in the $V^1_{11}$, $V^2_{11}$ quarter-plane the equipotentials are pairs
of lines

$$V^1_{11} = aV^2_{11} \quad\text{and}\quad V^2_{11} = aV^1_{11}, \tag{28}$$

where

$$a = (1 + \mathcal{R})/(1 - \mathcal{R}). \tag{29}$$

For $k > 2$ the equipotential surfaces in $k$ dimensions are cones through the origin situated
symmetrically with regard to the co-ordinate primes.
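
Equation (27) is simple to evaluate directly. The short sketch below is illustrative only and the function name is hypothetical; it uses the standard deviations 10, 30, 10, 40 of the worked example given later in the paper, and a pair of variances in the ratio 3 : 1 to illustrate (28) and (29).

```python
from math import sqrt

def population_R2(variances):
    """True squared multiple correlation from (27): 1 - k^2 / (sum V * sum 1/V)."""
    k = len(variances)
    return 1.0 - k**2 / (sum(variances) * sum(1.0 / v for v in variances))

# Standard deviations 10, 30, 10, 40, i.e. variances 100, 900, 100, 1600.
R2_example = population_R2([100.0, 900.0, 100.0, 1600.0])

# k = 2: variances in the ratio a : 1 should give a = (1 + R) / (1 - R).
R_pair = sqrt(population_R2([3.0, 1.0]))
a_pair = (1.0 + R_pair) / (1.0 - R_pair)

print(round(R2_example, 3), round(a_pair, 6))
```

Multiplying every variance by the same constant leaves the result unchanged, which is why the equipotentials for $k > 2$ are cones through the origin.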

Reverting to $k = 2$ three methods are available for testing the hypothesis that $V^1_{11} = V^2_{11}$:

(i) Fisher's $z$ or $F = \exp(2z)$

$$= g_{11}/g_{22}. \tag{30}$$

(ii) the $L_1$ criterion introduced by Pearson & Neyman (1930) and later extended to
$k > 2$ (Neyman & Pearson, 1931).

In the instance we are considering, i.e. equal sample sizes from both populations,

$$L_1 = 2\sqrt{(g_{11}g_{22})} \big/ (g_{11} + g_{22}) \tag{31}$$

$$= 2\sqrt{F}/(1 + F). \tag{31a}$$

(iii)
$$W(2, 1) = 4[g_{11}g_{22} - (g_{12})^2] \big/ [(g_{11} + g_{22})^2 - (2g_{12})^2].^{*} \tag{32}$$

Thus tests (i) and (ii) are exactly equivalent, as is known, the optimum critical region being
that corresponding to equal tails of the $F$-distribution. Criterion (iii) is that obtained by
Morgan (1939) and Pitman (1939), appearing as equation (12) in Morgan's paper, to test
that the variances in a normal bivariate population are equal. Morgan has compared the
powers of tests (i) and (iii) for $n$ = 12, 26 and 100 at a significance level of 0.10 and for these
sample sizes it appears that the tests are effectively of equal power.

When $n$ is large and, consequently, the two populations being independent, $(g_{12})^2/g_{11}g_{22}$
is converging in probability to zero,

$$W(2, 1) \sim L_1^2. \tag{33}$$

The cumulative distribution functions of criteria (ii) and (iii) are respectively
… (Nayer, 1936) ($x = L_1^2$) and … ($x = W(2, 1)$). Generally, $W(k, 1)$ for large $n$
converges in probability to the harmonic mean of the sample variances divided by their
arithmetic mean; $L_1$ (for equal sample sizes) is exactly equal to the geometric mean divided by
the arithmetic mean.



Example of the use of the test for a case with $k = 4$, $p = 1$

It is not easy to calculate $W(k, 1)$ from equation (17) if $k > 3$. Indeed, the main value of (17)
lies in showing the form of solution, and in establishing that this is independent of the
particular orthogonal transformations used. In the following example, therefore, orthogonal
transformations are made at once and the multiple correlation coefficient is calculated from
the numerical data. This procedure is far quicker than that involved in calculating $W(4, 1)$
from (17).


* See Appendix.





Below are given samples of 10 from each of four univariate normal populations:

    x1    x2    x3    x4        x1    x2    x3    x4
   -20   +24    +4   +52        +7   +15    +8    -8
    -1   +18    +9   -24        +5   +24    -1   +56
   -11   +27   -27     0       +18   -12    +1   -64
   +10   +21    +5   +48       +13   -24    -4   +12
    -4   -48    -3   +48        -6   +12    +5   -12



which have mean zero and standard deviations respectively 10, 30, 10, 40. Make the following
orthogonal transformation:

$$y_1 = x_1 + x_2 + x_3 + x_4, \quad y_2 = x_1 + x_2 - x_3 - x_4, \quad y_3 = x_1 - x_2 + x_3 - x_4, \quad y_4 = x_1 - x_2 - x_3 + x_4,$$

and obtain


    y1    y2    y3    y4        y1    y2    y3    y4
   +60   -52   -92    +4       +22   +22    +8   -24
    +2   +32   +14   -52       +84   -26   -76   +38
   -11   +43   -65   -11       -57   +69   +95   -57
   +84   -22   -54   +32        -3   -19   +21   +53
    -7   -97    -7   +95        -1   +13    -1   -35



Form the matrix of sums of squares and cross-products, i.e. $C$. This is

$$\begin{pmatrix}
+18636.1 & -9646.9 & -18232.9 & +7325.1 \\
-9646.9 & +21784.1 & +12018.1 & -19015.9 \\
-18232.9 & +12018.1 & +28692.1 & -9445.9 \\
+7325.1 & -19015.9 & -9445.9 & +22008.1
\end{pmatrix}$$


The matrix of sample correlation coefficients is therefore

$$\begin{pmatrix}
1 & -0.4788 & -0.7885 & +0.3617 \\
-0.4788 & 1 & +0.4807 & -0.8685 \\
-0.7885 & +0.4807 & 1 & -0.3759 \\
+0.3617 & -0.8685 & -0.3759 & 1
\end{pmatrix}$$



Hence the multiple correlation coefficient of $y_1$ on $y_2, y_3, y_4$ is given by $R^2 = 0.637$. Calculated
by the approximation indicated in the last paragraph of the preceding section,* $R^2 = 0.656$;
the true value obtained from equation (27) with standard deviations in the ratio 1 : 3 : 1 : 4 is 0.727.
The upper 10 and 5 % levels of significance, obtained from Thompson's tables with
$\nu_1 = n - k = 6$, $\nu_2 = k - 1 = 3$, are respectively 0.622 and 0.704. We find $L_1 = 0.565$, the
5 and 1 % levels obtained from Nayer's (1936) tables being respectively 0.797 and 0.719,
so that this test gives a more significant result than the one based on $R^2$. The relative merits
of $L_1$ and the test we have provided, which cannot be judged on the results of one example,
remain a problem to be investigated.


* I.e. calculated from 1 - (harmonic mean of $g_{ii}$)/(arithmetic mean of $g_{ii}$).
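
Two of the quoted figures depend only on the corrected sums of squares of the four samples and can be recomputed in a few lines. The sketch below is a check and not part of the original working; the sample values are entered as read from the table of samples (some digits in the scanned source are uncertain, so the entries are an assumption), and the variable names are hypothetical.

```python
from math import prod

# Samples of 10 from each of the four populations, as read from the table (assumption).
samples = [
    [-20, -1, -11, 10, -4, 7, 5, 18, 13, -6],       # x1
    [24, 18, 27, 21, -48, 15, 24, -12, -24, 12],    # x2
    [4, 9, -27, 5, -3, 8, -1, 1, -4, 5],            # x3
    [52, -24, 0, 48, 48, -8, 56, -64, 12, -12],     # x4
]

def corrected_sum_of_squares(xs):
    """Sum of squared deviations from the sample mean, i.e. g_ii."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)

g = [corrected_sum_of_squares(xs) for xs in samples]
k = len(g)

arithmetic_mean = sum(g) / k
harmonic_mean = k / sum(1.0 / v for v in g)
geometric_mean = prod(g) ** (1.0 / k)

approx_R2 = 1.0 - harmonic_mean / arithmetic_mean   # large-sample approximation
L1 = geometric_mean / arithmetic_mean               # L1 for equal sample sizes

print(round(approx_R2, 3), round(L1, 3))
```

Because the harmonic mean never exceeds the geometric mean, $1 - L_1$ can never exceed this approximate $R^2$; the two criteria nevertheless measure departure from equality of variances on different scales and are referred to different tables.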


































Summary 

An exact test has been put forward for the equality of variances and covariances in any
number of 1- or 2-variate normal populations; the test is also exact for two 3- or 4-variate
populations; but is restricted in application to equal sample sizes $n$ from the $k$ populations
where $n$ exceeds $pk$, $p$ being the number of variates. The moments of the criterion are available
for $k$ $p$-variate populations where the statistic used is equivalent to that employed by
Wilks (1935) to test the independence of two groups of variates (of sizes $p$ and $p(k-1)$), and
has the same distribution. In the univariate case the power of the test is known as a function
of one parameter. Comparison with the $L_1$ criterion has already been made when $p = 1$
and $k = 2$, the tests being practically the same, and an example worked out of the use of the
test when $p = 1$.

Our thanks are due to E. C. Fieller for drawing our attention to the papers by Morgan and
Pitman and suggesting that the test given there for the equality of two variances might be
extended to more than two; also to Prof. E. S. Pearson for pointing out the need of certain
explanatory additions.

The work described above has been carried out as part of the research programme of the
National Physical Laboratory, and this paper is published by permission of the Director
of the Laboratory.


REFERENCES 

Aitken, A. C. (1939). Determinants and Matrices. Oliver and Boyd, Edinburgh.

Bishop, D. J. (1939). On a comprehensive test of the homogeneity of variances and covariances in
multivariate problems. Biometrika, 31, 31.

Fisher, R. A. (1928). The general sampling distribution of the multiple correlation coefficient. Proc.
Roy. Soc. A, 121, 654.

Geary, R. C. (1935). The ratio of the mean deviation to the standard deviation as a test of normality.
Biometrika, 27, 310.

Morgan, W. A. (1939). A test for the significance of the difference between the two variances in a
sample from a normal bivariate population. Biometrika, 31, 13.

Nair, U. S. (1939). The application of the moment function in the study of distribution laws in statistics.
Biometrika, 30, 274.

Nayer, P. P. N. (1936). An investigation into the application of Neyman and Pearson's test, with
tables of percentage limits. Statist. Res. Mem. 1, 38.

Neyman, J. & Pearson, E. S. (1931). On the problem of k samples. Bull. int. Acad. Cracovie, Série A,
p. 460.

Pearson, E. S. & Neyman, J. (1930). On the problem of two samples. Bull. int. Acad. Cracovie,
Série A, p. 73.

Pitman, E. J. G. (1939). A note on normal correlation. Biometrika, 31, 9.

Thompson, C. M. (1941). Tables of percentage points of the incomplete beta-function. Biometrika,
32, 168.

Wald, A. & Brookner, R. J. (1941). On the distribution of Wilks' statistic for testing the independence
of several groups of variates. Ann. Math. Statist. 12, 137.

Wilks, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika, 24, 471.

Wilks, S. S. (1935). On the independence of k sets of normally distributed statistical variables.
Econometrica, 3, 309.

Wilks, S. S. (1943). Mathematical Statistics. Princeton University Press.



APPENDIX

… using (17), (32) is at once obtained for $W(2, 1)$. For $k > 2$ the … expression for …





THE ESTIMATION FROM INDIVIDUAL RECORDS OF THE 
RELATIONSHIP BETWEEN DOSE AND QUANTAL RESPONSE 

By D. J. FINNEY

Lecturer in the Design and Analysis of Scientific Experiment, University of Oxford 

1. Introduction 

A type of biometric problem frequently encountered by the statistician is that which 
requires the estimation and study of a relationship between dose and response. ‘Dose’ is 
here a general term indicating the magnitude of a stimulus applied to certain test subjects, 
and ‘ response’ is a measure of the effect which the stimulus produces on the subjects. When 
the test subjects are living matter, whether plants, animals or bacteria, pieces of tissue or 
single cells, the response to a specified dose is unlikely to be constant in repeated trials, and 
regression methods must be used in the estimation of the relationship. 

In some classes of data, the response is ‘ all-or-nothing ’ or quantal, and cannot be measured 
quantitatively. Ordinary regression methods are then no longer applicable; methods based 
on the transformation of the proportion of subjects showing the response at any dose level 
to the normal equivalent deviate (Gaddum, 1933), or to the probit (Bliss, 1934a, b), however,
have proved very powerful for simplifying the statistical analysis. In recent years, full
accounts of the underlying theory of these transformations, and of their application, have
been published by various authors (see, for example, Bliss, 1935a, b; Finney, 1947, 1948).
An additional difficulty sometimes found is that the intensity of the stimulus cannot be
selected in advance of a test, but can only be measured after the test has taken place; only
rarely will two or more subjects happen to receive exactly the same dose, and more usually
the records consist of a list of doses with, for each, a statement of whether a single subject
receiving that dose responded or not.* For example, in some methods for the testing of
insecticidal potency, poison bait is offered to individual insects; the dose received by any 
insect cannot be specified in advance, and must instead be measured as the amount of poison 
ingested. 

Data from experiments of this kind do not give empirical values for the proportion of 
subjects responding at each dose level, except in the trivial sense that every dose shows either 
zero or 100 % responding. Nevertheless, as Bliss (1938) has pointed out, the probit method 
can still be applied to estimation of the dose-response relationship. He has given a numerical 
example, though without showing full details of the working, but has admitted that assess- 
ment of the error of estimation presents some theoretical difficulties (Finney, 1947, §43). 
An interesting example of experimental results requiring this type of analysis has recently 
been brought to the notice of the writer by Mr R. W. Gilliatt. These introduce an additional 
complication, since the dose is expressed in terms of two measurements, and a probit plane 
(Finney, 1943) or other bivariate regression function must therefore be estimated. An 
account of the analysis, with computational details, may help those who have encountered 
analogous problems in biological or other investigations. 

* When response does not involve death or serious alteration of the test subject, one subject may be 
used many times; the example discussed in this paper is an instance. The form of the data will be the same,
though the interpretation may require that tolerance variation between and within subjects be dis- 
tinguished. 



2. The data 

Research in human physiology has demonstrated that, under carefully controlled experimental
conditions, a transient reflex vaso-constriction in the skin of the digits may follow
a single deep breath (Bolton, Carmichael & Stürup, 1936). Gilliatt (1947) has found that the
response depends in part on the volume of air taken in by the subject. Plethysmographic
measurement of the volume changes in a finger was used to indicate the occurrence of a
response, but assessment of the degree of vaso-constriction, in order to relate this to the
inspiratory stimulus, was not practicable. Thus the records obtained for each test show only
the volume of air inspired, the average rate of inspiration, and whether or not vaso-constriction
was produced. The above brief outline is sufficient for appreciation of the statistical
problem, but a full account of the experimental procedure may be found in Gilliatt’s paper;
the results discussed here are presented in his Fig. 5.

Fig. 1. Contours of dose-response surface for 0.1, 0.25, 0.5, 0.75 and 0.9 frequency of response,
estimated from three-parameter equation. ○ no vaso-constriction; ● vaso-constriction.

The data, which Mr Gilliatt has kindly made available to the writer, were obtained from 
thirty-nine tests, in twenty of which vaso-constriction occurred. Tests were made on three 
different subjects, nine on D.W., eight on V.P.W., and twenty-two on S.J.S.; the results of
the tests, with the subjects in this order, are shown in Table 1. In Fig. 1 are shown the
thirty-nine combinations of volume in litres (V) and rate of inspiration in litres per second



Table 1. Experimental data and details of calculations

[The columns for V, R, x1, x2, response, Y, w, y and P are illegible in this copy; the three
columns of weighted products, with their totals and means, are given below in the order of testing.]

Test       wx1       wx2        wy
  1      0.2826    0.1656    1.3068
  2      0.1232    0.0832    0.6144
  3      0.2970    0.3780    1.8954
  4      0.2992    0.4012    3.0498
  5      0.4500    0.7550    3.2650
  6      0.4760    0.8624    3.5952
  7      0.0000    0.0000    0.0000
  8      0.6552    0.7749    2.3562
  9      0.0380    0.0352    0.0824
 10      0.0000    0.0000    0.0000
 11      0.0000    0.0000    0.0000
 12      0.3922    0.7632    1.8709
 13      0.4914    0.9324    2.3436
 14      0.2760    0.3288    1.7016
 15      0.3872    0.6908    2.9304
 16      0.1496    0.1331    0.8261
 17      0.0151    0.0120    0.0830
 18      0.3720    0.4600    3.1480
 19      0.7749    0.6489    2.3436
 20      0.2646    0.2646    1.5057
 21      0.0360    0.0780    0.1338
 22      0.4606    0.5311    1.6027
 23      0.7119    0.7119    2.3562
 24      0.7080    0.6780    2.1720
 25      0.4080    0.4250    2.3324
 26      0.1014    0.1534    0.3354
 27      0.4662    0.4366    2.5123
 28      0.6272    0.8192    2.4000
 29      0.8064    0.6174    3.9564
 30      0.0480    0.0240    0.0824
 31      0.8008    0.4928    3.5952
 32      0.1233    0.0432    0.2169
 33      0.6552    0.7938    2.3436
 34      0.5512    0.7102    3.4291
 35      0.5400    0.6500    3.2650
 36      0.4500    0.7600    3.2650
 37      0.6272    0.8192    2.4000
 38      0.4664    0.6784    1.8709
 39      0.6660    0.7260    3.8040

Totals  14.9980   17.8375   74.9914
Means    1.0495    1.2483    5.2478

                        Swx1^2      Swx1x2      Swx2^2       Swx1y       Swx2y       Swy^2
Sums                  16.235606   18.338532   22.783383   79.738989   94.125032   433.586326
Corrections for means 15.741078   18.721261   22.265669   78.706859   93.608054   393.541643
Deviations             0.494528   -0.382729    0.517714    1.032130    0.516978    40.044683




(R), together with indications of whether or not the subject responded under these conditions.
Inspection of Fig. 1 shows that, in general, when both V and R were small no response
occurred, when either was large (unless the other was very small) the response occurred, and
in an intermediate region the proportion of responses increased as either V or R increased.
There was no sharply defined threshold separating combinations of V and R giving the
response from those giving no response; instead, there appeared to be a probability of response
ranging from practical certainty under some conditions to zero under others.

As an aid to fuller understanding of the influence of breathing on vaso-constriction,
examination of the relationship between V, R, and the probability of response seemed desirable.
Since so few observations were available for each subject, the data were unlikely to be
sufficient to show differences between subjects; this point is discussed later, but in the main
analysis the distinction between subjects is ignored. For any form of response assessment, 
the testing of one subject many times must introduce a danger that the result of one test 
will be affected not only by its own stimulus but by preceding stimuli and by the effects they 
produced. In this investigation, each subject was given a number of preliminary tests until 
he appeared to have settled into the routine. The observations recorded in Table 1 were 
obtained after these preliminary trials; they are tabulated in the order of testing, and show 
no indication of effects of previous history, but clearly such effects would have to be very 
pronounced if they were to be detectable on this amount of data. 

3. Method of analysis

Preliminary examination of the data suggested that the occurrence of a response was largely
determined by the magnitude of VR, the product of volume and rate, curves on which the
probability of response has a constant value being approximately hyperbolae of the form

$$VR = \text{constant}. \tag{1}$$

A little consideration shows that an equation of this type is more reasonable than an equation
linear in V and R, though the data are almost certainly inadequate for discriminating between
many alternative types of relationship that might be postulated. A system of curves similar
to, but rather more general than, equation (1), namely,

$$V^{\beta_1} R^{\beta_2} = \text{constant}, \tag{2}$$

was selected for trial; this equation may alternatively be regarded as representing a series
of parallel linear relationships

$$\beta_1 \log V + \beta_2 \log R = \text{constant} \tag{2a}$$

between the logarithms of volume and rate for a fixed probability of response.

A specified combination of V and R will not necessarily always give the same result
(response or no response) with a subject, for, even though the subject is unaltered, minor
uncontrolled variations in his environment may affect his susceptibility to the applied
stimulus. For a particular value of V, the threshold value of R (the value which under the
conditions prevailing at any instant would be just sufficient to produce a response) will
have a frequency distribution; similarly, for a particular R, there will be a frequency
distribution of threshold values of V. If these distributions may be taken as normal in log V
and log R, and, for simplicity, they are supposed to be such that the mean of either logarithm
is linearly related to the selected value of the other, then the probability of response will be
determined by an expression of the form

$$\beta_1 \log V + \beta_2 \log R,$$




and the threshold values of this quantity will be normally distributed. If $x_1$ and $x_2$ are
written for $\log(10V)$ and $\log(10R)$ respectively (the factor of 10 is introduced in order to
make $x_1$ and $x_2$ always positive), this statement enables the probability of response, P, to
be expressed as

$$P = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\alpha + \beta_1 x_1 + \beta_2 x_2} e^{-\frac{1}{2}u^2}\, du, \tag{3}$$

where $\alpha$, $\beta_1$ and $\beta_2$ are parameters to be estimated from the data. The estimation may be
regarded as the fitting of a probit regression plane, for Y, the probit of P, is given by

$$Y = 5 + \alpha + \beta_1 x_1 + \beta_2 x_2. \tag{4}$$


Substitution of the value of Y corresponding to a specified probability gives the required
linear relationship, equation (2a), between $x_1$ and $x_2$ for that probability, from which the
estimated curves of constant probability, equation (2), may easily be derived.
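
The relation between equations (3) and (4) may be illustrated by a short computational sketch. It is not part of the original working, which relied on printed tables; the function names are hypothetical, logarithms are taken to base 10, and the parameter values are those of equation (8), obtained later in the paper, so that $\alpha = -9.182 - 5$.

```python
import math
from statistics import NormalDist

# Parameter values assumed from equation (8) of the paper, where
# Y = 5 + alpha + beta1*x1 + beta2*x2 = -9.182 + 6.6843*x1 + 5.9400*x2.
ALPHA, BETA1, BETA2 = -14.182, 6.6843, 5.9400

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def probit(volume, rate, alpha=ALPHA, beta1=BETA1, beta2=BETA2):
    """Expected probit Y of equation (4) for volume V (litres) and rate R (litres/s)."""
    x1 = math.log10(10.0 * volume)
    x2 = math.log10(10.0 * rate)
    return 5.0 + alpha + beta1 * x1 + beta2 * x2


def response_probability(volume, rate, **params):
    """Probability P of equation (3): the normal integral taken up to Y - 5."""
    return _STD_NORMAL.cdf(probit(volume, rate, **params) - 5.0)


def rate_on_contour(volume, p, alpha=ALPHA, beta1=BETA1, beta2=BETA2):
    """Rate R at which the response probability equals p for a given volume V."""
    deviate = _STD_NORMAL.inv_cdf(p)             # normal equivalent deviate, Y - 5
    x1 = math.log10(10.0 * volume)
    x2 = (deviate - alpha - beta1 * x1) / beta2  # equation (4) solved for x2
    return 10.0 ** x2 / 10.0


if __name__ == "__main__":
    print(round(response_probability(1.71, 1.0), 3))
    print(round(rate_on_contour(1.71, 0.5), 3))
```

Solving equation (4) for $x_2$ at a fixed probability is simply equation (2a) in another form, so that `rate_on_contour` traces one curve of the family in equation (2).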

The procedure for fitting a probit plane has been described elsewhere (Finney, 1943, 1947, 
§31), and its chief features need no alteration for application to individual records. Pro- 
viding that a first approximation to the equation can be guessed, repeated cycles of com- 
putation will give values for the parameters which approach more and more closely to the 
maximum likelihood estimates. Care in the choice of the first approximation will reduce 
the number of cycles needed; a poor choice will delay the convergence, though it will not 
affect the ultimate result. Since only a single observation is available for each combination
of $x_1$ and $x_2$, every working probit is either a maximum or minimum value, according to
whether or not the response occurs. When there is only one dose factor, in the fitting of a
probit regression line to records of individuals, grouping of doses and treatment of the
observations in a group as if they related to an average dose may reduce the labour of the
early computing cycles, but, since it will tend to give an underestimate of the regression
coefficient, the final cycle may need to use the detailed records. Bliss (1938) has given an
example illustrating grouping of this kind. Grouping is less easily applied, however, when two
or more dose factors have to be used, and, for the data under discussion, the individual records 
were used throughout except in the formation of the first approximation. 
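
For readers without the printed tables, the quantities entering each cycle may be sketched as follows. The sketch assumes the standard probit definitions, which the text above uses but does not write out: with $Z$ the normal ordinate at $Y - 5$ and $Q = 1 - P$, the weighting coefficient is $Z^2/PQ$, the maximum working probit is $Y + Q/Z$ and the minimum working probit is $Y - P/Z$. The function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def weighting_coefficient(expected_probit):
    """Weight w = Z^2 / (P * Q) for a single observation at expected probit Y."""
    u = expected_probit - 5.0
    z = _STD_NORMAL.pdf(u)   # ordinate of the normal curve
    p = _STD_NORMAL.cdf(u)   # expected probability of response
    return z * z / (p * (1.0 - p))


def working_probit(expected_probit, responded):
    """Maximum working probit (response) or minimum working probit (no response)."""
    u = expected_probit - 5.0
    z = _STD_NORMAL.pdf(u)
    p = _STD_NORMAL.cdf(u)
    if responded:
        return expected_probit + (1.0 - p) / z   # Y + Q/Z
    return expected_probit - p / z               # Y - P/Z


if __name__ == "__main__":
    # An observation with expected probit 3.7 that nevertheless responded.
    print(round(weighting_coefficient(3.7), 2), round(working_probit(3.7, True), 2))
```

For an expected probit of 3.7 these formulae give a weight close to 0.34 and a maximum working probit close to 8.97; a response at so low an expected probit therefore contributes a large working value but carries only moderate weight.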

In the standard form of probit analysis, with moderately large numbers of observations
at each level of dose, a $\chi^2$ is usually computed for testing the significance of discrepancies
between the data and the fitted equation; this $\chi^2$ is numerically the same as would be obtained
by calculation from expected and observed numbers of responses and non-responses for
each dose. If there are few observations in any dose group, the expected number of responses
or of non-responses (or of both) is likely to be small, and, as is well known, $\chi^2$ may then fail
to follow the sampling distribution tabulated for that statistic. Data of the type under
discussion here are extreme examples of this situation, the number of observations for each
dose being reduced to unity, so that any disturbance of the $\chi^2$ distribution is likely to be
encountered in its most acute form. No complete theoretical investigation of this matter
has yet been made, but the practical implications are discussed more fully in § 5.

On the assumption that the estimate of equation (4) is an adequate representation of the 
data, lines of constant response probability may be obtained for any specified probability; 
these may be plotted according to equation (2) on a V, R scale. Standard statistical processes
also enable fiducial limits to be assigned to the position of any of these curves. The difficulty 
of dealing with the estimation of error for individual records, and the inadequacy of the 
data for any sensitive test of whether equation (2) is a satisfactory representation of the 




system of curves, throw doubts on the exact interpretation of these fiducial limits. Never- 
theless, they give some idea of the confidence that can be attached to the estimated curves, 
at least for moderate values of V and R; for extremes of either measurement, far more
extensive data would be needed before much faith could be placed in the fitted equation. 

4. Computations for estimating the three-parameter equation

In this and the two succeeding sections, the computations for Gilliatt’s data will be described
in detail. The first five columns of Table 1 show the thirty-nine pairs of values of V and R
which occurred in the experiments, followed by the corresponding values of $x_1$ and $x_2$,
together with a statement of whether or not the subject responded. Before the probit
computations could be initiated, a first approximation to equation (4) was needed; this was
obtained with the aid of the suggestion, from the plotting of the data shown in Fig. 1, that
the constant probability curves were approximately the hyperbolae of equation (1), or
alternatively

$$x_1 + x_2 = \text{constant}.$$

As Bliss (1938) has pointed out, there is no objection to the use of overlapping groups in the
formation of the first approximation. The data were therefore grouped according to the
value of $(x_1 + x_2)$, as shown below, and the proportion of responses in each group was obtained
from Table 1:



Range of x1 + x2   Responses   Proportion (P)   Probit of P   First approximation
   1.5-1.9            0/1          0.00             -               3.3
   1.6-2.0            0/7          0.00             -               3.6
   1.7-2.1            2/7          0.29            4.4              3.9
   1.8-2.2            2/9          0.22            4.2              4.2
   1.9-2.3            3/14         0.21            4.2              4.5
   2.0-2.4            8/19         0.42            4.8              4.8
   2.1-2.5           13/24         0.54            5.1              5.1
   2.2-2.6           17/26         0.65            5.4              5.4
   2.3-2.7           16/17         0.94            6.6              5.7
   2.4-2.8           12/12         1.00             -               6.0


Each proportion was regarded as an estimate for the median value of $(x_1 + x_2)$ in the group,
i.e. 1.7, 1.8, 1.9, ..., and its probit was read from one of the standard tables (Finney, 1947,
Table I; Fisher & Yates, 1947, Table IX). As may be seen above, these probits were fairly
well fitted by the guessed equation

$$Y = -1.8 + 3(x_1 + x_2), \tag{5}$$

which was therefore used as a first approximation to equation (4). 

A first set of expected probits was calculated from equation (5), and inserted as Y in an
earlier version of Table 1. A cycle of routine probit calculations, just as described in the next
two paragraphs, then led to an improved approximation to the required estimate, on which
a second cycle of improvement was based. The figures shown in Table 1 relate to the fourth
of these cycles, based upon the approximation

$$Y = -9.127 + 6.666x_1 + 5.906x_2 \tag{6}$$

from the third cycle. Equation (6) is very different from equation (5), suggesting that more
care might have been given to the selection of a first approximation; that the grouping












adopted would lead to underestimation of the regression coefficients was expected, but 
insufficient allowance for this was made. Of course the ‘improvement ’ in the approximations 
refers to their approach to the solution of the maximum likelihood equations, and is not
necessarily always an approach to the true relationship. 

The column of expected probits, Y, in Table 1 was calculated by substitution of pairs of
values $x_1$, $x_2$ in equation (6); one decimal place here is quite sufficient. The weighting
coefficient, w, for each observation was then read from tables (Finney, 1947, Table II; Fisher &
Yates, 1947, Table XI) and entered in its column. The working probit, y, takes a maximum
value for every observation giving a response and a minimum value for every observation
giving no response, since these give empirical rates of 100 % and zero respectively; values
of y were read directly from Finney’s table (1947, Table III; or, less simply for the minimum
values, from Fisher & Yates, 1947, Table XI). The numbers of decimal places shown for the
entries in Table 1 are sufficient for data of this type; indeed possibly one decimal for w and
for y would be enough. Columns $wx_1$, $wx_2$ and $wy$ were then filled, and the weighted sums of
squares and products of deviations, required for the calculation of the regression of y on
$x_1$ and $x_2$, were completed at the bottom of the table.

The equations giving the estimates of the regression coefficients, $b_1$ and $b_2$, are

$$\begin{aligned} 0.494528\,b_1 - 0.382729\,b_2 &= 1.032130, \\ -0.382729\,b_1 + 0.517714\,b_2 &= 0.516978. \end{aligned}$$

Later calculations use the variances and covariance of $b_1$ and $b_2$; the equations were therefore
solved by first obtaining the matrix inverse to that formed by the coefficients of $b_1$ and $b_2$
(Finney, 1943, 1947, §31; Fisher, 1946, §29). This matrix is

$$\begin{pmatrix} v_{11} & v_{12} \\ v_{12} & v_{22} \end{pmatrix} = \begin{pmatrix} 4.726144 & 3.493883 \\ 3.493883 & 4.514482 \end{pmatrix}; \tag{7}$$

the accuracy of the data is insufficient to need the number of decimal figures shown here,
but their retention assists the checking and maintains the internal consistency of the
analysis. Now

$$b_1 = 1.032130\,v_{11} + 0.516978\,v_{12} = 6.68426,$$

and similarly $b_2 = 5.94003$.

The estimate of equation (4) is then

$$Y = \bar{y} + b_1(x_1 - \bar{x}_1) + b_2(x_2 - \bar{x}_2)$$

or

$$Y = -9.182 + 6.6843x_1 + 5.9400x_2, \tag{8}$$

a result which differs little from equation (6) and may be regarded as a sufficiently close
approximation to the maximum likelihood estimate. Since

$$b_2/b_1 = 0.889, \tag{9}$$

equation (8) may be transformed to give

$$VR^{0.889} = \text{constant} \tag{10}$$

as the relationship estimated to exist between V and R for a specified probability; the value
of the constant can be obtained by substitution of the probit of the probability in equation
(8), a process which gives 1.10, 1.36, 1.71, 2.16 and 2.66 for probabilities of 10, 25, 50, 75
and 90 % respectively. Typical contours have been drawn in Fig. 1 so as to indicate the form
of the relationship.
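
The arithmetic of this section can be retraced by machine. The following is only a sketch of the final cycle, taking as given the corrected sums of squares and products and the weighted means from the foot of Table 1; the function and variable names are hypothetical and do not come from the paper.

```python
from statistics import NormalDist

# Corrected weighted sums of squares and products from the foot of Table 1.
S11, S12, S22 = 0.494528, -0.382729, 0.517714
S1Y, S2Y = 1.032130, 0.516978
# Weighted means of x1, x2 and y from Table 1.
XBAR1, XBAR2, YBAR = 1.0495, 1.2483, 5.2478


def invert_symmetric_2x2(a11, a12, a22):
    """Return (v11, v12, v22), the inverse of the matrix [[a11, a12], [a12, a22]]."""
    det = a11 * a22 - a12 * a12
    return a22 / det, -a12 / det, a11 / det


# Equation (7): variances and covariance of the regression coefficients.
v11, v12, v22 = invert_symmetric_2x2(S11, S12, S22)

# Regression coefficients and the constant term of equation (8).
b1 = S1Y * v11 + S2Y * v12
b2 = S1Y * v12 + S2Y * v22
intercept = YBAR - b1 * XBAR1 - b2 * XBAR2


def contour_constant(p):
    """Constant c in V * R**(b2/b1) = c for response probability p, as in equation (10)."""
    y = 5.0 + NormalDist().inv_cdf(p)   # probit of p
    # With x1 = 1 + log10(V) and x2 = 1 + log10(R), equation (8) becomes
    # log10(V) + (b2/b1) * log10(R) = (y - intercept)/b1 - 1 - b2/b1.
    log_c = (y - intercept) / b1 - 1.0 - b2 / b1
    return 10.0 ** log_c


if __name__ == "__main__":
    print(round(b1, 4), round(b2, 4), round(intercept, 3))
    print([round(contour_constant(p), 2) for p in (0.10, 0.25, 0.50, 0.75, 0.90)])
```

Inverting the matrix first, instead of eliminating one unknown directly, mirrors the paper's own choice: the elements of the inverse are needed again when the limits of error are discussed.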




5. Goodness of fit

When probit analysis is applied to data containing many observations in each dose group,
the weighted sum of squares of deviations between the empirical probits and the predictions
from the fitted equations is a $\chi^2$ with degrees of freedom equal to the number of dose groups
reduced by the number of fitted parameters. If $S_{uv}$ is written for the weighted sum of products
of deviations of variates u and v, application of this method here would give

$$\begin{aligned} \chi^2 &= S_{yy} - b_1 S_{x_1 y} - b_2 S_{x_2 y} \\ &= 40.045 - 6.6843 \times 1.0321 - 5.9400 \times 0.5170 \\ &= 30.08. \end{aligned} \tag{11}$$

When the dose groups are small, however, the $\chi^2$ so calculated cannot be trusted as an
indicator of the significance of deviations from the fitted equation, and it is presumably most
unreliable when each group is reduced to a single observation. Apart from slight discrepancies
caused by imperfect approximation to the maximum likelihood solution, the $\chi^2$ in equation
(11) is algebraically identical with that which would be derived, by the usual form of
calculations, from comparison of observed numbers responding and not responding in each
group with expectations computed from the fitted equation. As is well known from the
study of contingency tables, when the expectations in some classes are small the sampling
distribution of such a $\chi^2$ may be very different from that shown in the standard tables
(Finney, 1947, Table VI; Fisher & Yates, 1947, Table IV); with data from individual records,
no class can have an expectation greater than unity, and for many the expectation will be
very much less, so that the discrepancy from the tabulated distribution is likely to be
serious.
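
The arithmetic of equation (11) may be checked in a few lines. This is a sketch only; the function name is hypothetical, and the inputs are the corrected sums from the foot of Table 1 together with the coefficients of equation (8).

```python
# Corrected sums from Table 1 and regression coefficients from equation (8).
S_YY = 40.044683
S_XY = (1.032130, 0.516978)      # weighted sums of products of (x1, y) and (x2, y)
SLOPES = (6.6843, 5.9400)        # b1, b2


def residual_chi_square(s_yy, slopes, s_xy):
    """Weighted residual sum of squares about the fitted probit plane."""
    return s_yy - sum(b * s for b, s in zip(slopes, s_xy))


chi_square = residual_chi_square(S_YY, SLOPES, S_XY)
degrees_of_freedom = 39 - 3      # observations less fitted parameters

if __name__ == "__main__":
    print(round(chi_square, 2), degrees_of_freedom)
```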

The general effect of small expectations on the random sampling distribution of $\chi^2$ appears
to be that the mean value remains about equal to the number of degrees of freedom, but that
the variance in repeated sampling is increased. Consequently, samples from a population
according with the null hypothesis are likely to show an excess of very high and very low
values, as judged by the tables of $\chi^2$. Thus there is little danger that significant evidence of
deviations from expectation will be overlooked in an uncritical application of the test, though
apparently significant values of $\chi^2$ need to be examined with care before they are regarded
as evidence sufficient to justify rejection of the null hypothesis. Low values, as in Gilliatt’s
data (30 with 36 degrees of freedom), need cause little alarm, for they clearly indicate no
serious deviation from expectation. High values may in the first instance be compared
with the standard tables of the $\chi^2$ distribution; if they fall beyond the significance level, a
closer examination should be made before judging the null hypothesis to be untenable, for
the apparent significance may be due to large contributions from one or two aberrant points.
Gilliatt’s data provide an illustration of this. The expected probits for each pair of values
of $x_1$ and $x_2$ in Table 1 have been calculated from equation (8), and the probabilities,
P (= 1 − Q), corresponding to these have been entered in the last column of the table;
P is then the expectation of the number of responses for each dose. The $\chi^2$ obtained from the
observed and expected numbers in seventy-eight classes is easily seen to be the sum of Q/P
for all doses giving a response, plus P/Q for all giving no response. Inspection of the column
for P shows small contributions to $\chi^2$ everywhere, except for two instances of responses with
probabilities of only 0.098 and 0.128, contributing 9 and 7 respectively; clearly the occurrence
of these two responses as the most extreme events in thirty-nine trials need not be regarded

as serious evidence against the null hypothesis. The result of calculating $\chi^2$ by this more
laborious process is a total of 30.3, which agrees closely with that already given in
equation (11).
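
The reduction to Q/P and P/Q follows from the two classes of each record: a response is an observed 1 against an expectation P together with an observed 0 against an expectation Q, giving $Q^2/P + Q = Q/P$. A minimal sketch of the summation is given below; the function name is hypothetical, and the probabilities used in the demonstration are only the two aberrant values quoted in the text, not the whole of the P column.

```python
def chi_square_individual(probabilities, responses):
    """Sum of Q/P over records with a response plus P/Q over records without."""
    total = 0.0
    for p, responded in zip(probabilities, responses):
        q = 1.0 - p
        total += q / p if responded else p / q
    return total


if __name__ == "__main__":
    # Contributions of the two unexpected responses mentioned in the text.
    print(round(chi_square_individual([0.098], [True]), 1))
    print(round(chi_square_individual([0.128], [True]), 1))
    # The same expectation without a response contributes very little.
    print(round(chi_square_individual([0.128], [False]), 2))
```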

One method of modifying a $\chi^2$ test so as to remove its extreme sensitivity to deviations
from small expectations is to combine expected and observed frequencies over several
adjacent groups, so as to obtain groups with larger expectations; the number of degrees of
freedom is then taken as the number of remaining groups less the number of fitted
parameters. Of course the groups must be chosen objectively, and without regard to the
agreement between the frequencies. The statistic still will not follow the $\chi^2$ distribution exactly,
but the approximation should be fairly satisfactory under the usual restriction that the
groups be so chosen that none of the expected frequencies is small. This procedure often has
to be adopted in probit analysis because of small expectations at very low or very high doses
(Finney, 1947, §18). With individual records, however, only very extensive grouping will
give expectations sufficiently large for the $\chi^2$ test to be trusted; the reduction of a large $\chi^2$
to a value below the significance level might then appear indicative of an insensitive test
rather than of absence of serious discrepancies.*

Probably no completely satisfactory solution of the difficulty is to be expected. Individual 
records usually arise from experimental work in which the obtaining of large numbers of 
observations presents considerable difficulty. Often the whole series will consist of less than 
fifty observations, and, unless previous information enables the range of doses to be chosen 
satisfactorily, many of the observations will be made at doses for which response is either 
almost certain or almost impossible. Even if the individual dose-tolerances could be measured
directly, a test of normality of their distribution (which is what the $\chi^2$ test attempts to
provide) could not be very sensitive when based on only fifty measurements; if, instead,
only quantal data are available, indicating merely whether a dose is below or above the
tolerance value, a sensitive normality test is still less likely to exist (Finney, 1947, § 43).

Gilliatt’s data, a series of only thirty-nine observations, provide an extreme instance of
the difficulty of formulating a sensitive test of goodness of fit. Nevertheless, an attempt has
been made to examine the discrepancies between the observations and the null hypothesis
expressed by equation (4). In Table 2 are compared the observed and expected frequencies
when the data are grouped according to the value of $VR^{0.889}$. This is equivalent to a grouping
based on the value of Y, the expected probit in equation (8), and, as this quantity had been
evaluated for each observation in order to give P, it was used in the construction of Table 2.
Since three parameters have been estimated from the data, four groups is the least number
for giving a $\chi^2$ test. The limits of the groups were chosen so as to give similar numbers of
observations in each. Inspection of Table 2 shows that the groups are still too small for a
$\chi^2$ test to be trusted, thus suggesting that the data are inadequate for any useful test of goodness
of fit to be made. The only anomaly in Table 2 is the occurrence of two responses where the
expectation is 0.3, and this is clearly insufficient to cause much worry.

* In his discussion of the analysis of individual records, Bliss (1938) suggests adjustment of the $\chi^2$
test, not by altering the calculation of the statistic but by reducing the number of degrees of freedom
allotted to it; he gives an empirical rule for the reduction, based upon the expectations in terminal dose
groups. This method, however, not only lacks any theoretical basis, but seems liable to have an effect
opposite to that which is needed; it will attribute significance to high values of $\chi^2$ even more readily
than will the unadjusted test.

The inadequacy of the data for detecting any differences in sensitivity between the three
subjects may be seen from Table 3. The first nine entries in Table 1 relate to D.W., and
summation of the values of P gives the expected number of responses for this subject; similarly
the next eight and the last twenty-two entries give the numbers for V.P.W. and S.J.S.
respectively. Inspection of Table 1 shows that the tests on each subject were fairly widely
distributed over the range of values of $x_1$ and $x_2$. Table 3 shows excellent agreement between
totals of observed and expected responses for each subject, thus suggesting that any
individual differences that exist are small by comparison with the variation in sensitivity of
the same subject in different tests.


Table 2. Comparison of observed and expected frequencies of response

                      Frequencies of results
                  Observed              Expected
Range of Y      -     +    Total       -        +
Below 4         8     2     10        9.72     0.28
4-5             6     0      6        3.92     2.08
5-6             5     8     13        4.26     8.74
Above 6         0    10     10        0.59     9.41
Total          19    20     39       18.49    20.51

Table 3. Comparison of subjects

                      Frequencies of results
                  Observed              Expected
Subject         -     +    Total       -        +
D.W.            3     6      9        4.0      5.0
V.P.W.          4     4      8        3.5      4.5
S.J.S.         12    10     22       11.0     11.0
Total          19    20     39       18.5     20.5

6. Limits of error 

The variances of $b_1$ and $b_2$ and the covariance between them are respectively $v_{11}$, $v_{22}$ and
$v_{12}$, as defined in equation (7). Hence the variance of Y, the expected probit corresponding to
any pair of values $x_1$, $x_2$, is

$$V(Y) = \frac{1}{Sw} + v_{11}(x_1 - \bar{x}_1)^2 + 2v_{12}(x_1 - \bar{x}_1)(x_2 - \bar{x}_2) + v_{22}(x_2 - \bar{x}_2)^2, \tag{12}$$

where $Sw$ is the sum of the w column in Table 1. All these variances are derived from binomial
probability distributions. In the usual form of probit analysis, with a batch of subjects at
each dose, the precision of the estimated relationship between dose and response is discussed
as though the variation were normal, an assumption which is justifiable on account of the
large numbers of individuals involved. Here, with only thirty-nine observations in all, the



assumption is less safe, but may be adopted for lack of any more trustworthy method of 
dealing with the data. It is unlikely to be seriously misleading, except possibly for extreme 
levels of the response probability, P. 
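
Equation (12) is easily evaluated for any pair of values of $x_1$ and $x_2$. The sketch below uses the elements of the inverse matrix in equation (7) and the weighted means from Table 1. The value of $Sw$ is not taken from the table directly; it is recovered here, as an assumption, from the total of the $wx_1$ column divided by the weighted mean of $x_1$, and the function names are hypothetical.

```python
import math

# Elements of the inverse matrix, equation (7).
V11, V12, V22 = 4.726144, 3.493883, 4.514482
# Weighted means of x1 and x2 from Table 1.
XBAR1, XBAR2 = 1.0495, 1.2483
# Assumption: Sw recovered as (total of w*x1) / (weighted mean of x1), about 14.29.
SUM_W = 14.9980 / 1.0495


def variance_of_probit(x1, x2):
    """V(Y) of equation (12) at the point (x1, x2)."""
    d1 = x1 - XBAR1
    d2 = x2 - XBAR2
    return 1.0 / SUM_W + V11 * d1 * d1 + 2.0 * V12 * d1 * d2 + V22 * d2 * d2


def standard_error_of_probit(volume, rate):
    """Standard error of Y for volume V (litres) and rate R (litres per second)."""
    x1 = math.log10(10.0 * volume)
    x2 = math.log10(10.0 * rate)
    return math.sqrt(variance_of_probit(x1, x2))


if __name__ == "__main__":
    print(round(variance_of_probit(XBAR1, XBAR2), 4))
    print(round(standard_error_of_probit(1.0, 1.5), 3))
```

The variance is least at the weighted means of $x_1$ and $x_2$, where it reduces to $1/Sw$, and it increases as the point moves away from the centre of the data.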

Fig. 2. Fiducial limits (5 % probability) to 0.5 frequency contour of Fig. 1.
○ no vaso-constriction; ● vaso-constriction.

Equation (12) may now be used in the assignment of fiducial limits to any one of the curves
of equal probability given by equation (10). For suppose that t is the normal deviate
corresponding to the significance level to be used in defining the fiducial limits, and that $Y_0$ is the
probit of a probability $P_0$. Then for any values of $x_1$, $x_2$ for which

$$(Y - Y_0)^2 > t^2 V(Y),$$

where Y is determined from equation (8), the expected probit differs significantly from $Y_0$,
and for values of $x_1$, $x_2$ which reverse the inequality the difference is not significant. Therefore
the equation

$$(Y - Y_0)^2 = t^2 V(Y) \tag{13}$$

gives the limiting values of $(x_1, x_2)$ for which the null hypothesis that the true expected probit
is $Y_0$ is not untenable in the light of the data; in other words, equation (13) defines curves in
the $(x_1, x_2)$ plane which are fiducial limits to the estimated locus of points having a constant
response probability $P_0$. These curves are clearly hyperbolae. In Figs. 2 and 3, the 5 %
fiducial limit curves (t = 1.960) for $P_0 = 0.5$ and $P_0 = 0.9$ respectively have been plotted in



the $(V, R)$ plane; details of the calculation need not be given here, but Fig. 2, for example,
is derived from the equation

(14-182- 6>6843a;i-5-9400a;2)2 = 3-841 j^^^ + 4-7261(a;i-l-()495)2 

+ 6-9878(a;i- 1-0496) {x^- 1-2483) + 4-5145(a!2~ l-2483)aj . 

The pairs of curves are like hyperbolae in form. That for $P_0 = 0.5$ defines a band on either
side of the estimated relationship which is quite narrow for moderate values of $V$ and $R$



Fig. 3. Fiducial limits (5 % probability) to 0.9 frequency contour of Fig. 1.
○ no vaso-constriction; ● vaso-constriction.

though naturally it widens considerably at the extremes. That for $P_0 = 0.9$, as might be
expected from general consideration of the problem, allows much greater uncertainty on the
side of high values of $V$ and $R$; similarly, for $P_0 = 0.1$, that band would be relatively wider
on the side of low values of $V$ or $R$.

The curves shown in Figs. 1, 2 and 3 may be regarded as plane sections, for selected values
of $Y$, of a three-dimensional diagram relating $Y$ to $V$ and $R$. In terms of $x_1$, $x_2$ instead of
$V$, $R$, this diagram is the three-dimensional analogue of the familiar diagram showing a
regression line with hyperbolic curves indicating limits of error on either side; the line




generalizes to a plane, and the limits are now defined by two sheets of a hyperboloid, one 
above and one below the plane. 

The theoretical basis of the curves illustrated in Figs. 2 and 3 is perhaps insecure, but
undoubtedly they give a useful indication of the dependence of the probability of response
on $V$ and $R$ and of the reliability of the estimation of this relationship. Much as an
experimenter might wish for a more precise assessment of the effects of $V$ and $R$, experience
suggests that results such as those obtained here are as good as can be expected from a total of
thirty-nine quantal observations.

7. The two-parameter equation

In § 3, the equation
$$VR = \text{constant} \qquad (1)$$
was suggested as an expression of the curves of constant response probability, but the more
complex equation (2) was adopted for use in §§ 4-6. There are no theoretical reasons for
believing that equation (1) represents the true form of the relationship, and the more general
form was chosen in order that the complete calculations might be illustrated. The values of
$b_1$ and $b_2$ obtained, however, do not differ very greatly by comparison with the standard
error of their difference; in fact
$$V(b_1 - b_2) = v_{11} - 2v_{12} + v_{22} = 2.253,$$
and therefore $b_1 - b_2 = 0.744 \pm 1.501$.

In the absence of any significant difference between the regression coefficients, the common
scientific procedure of preferring the simpler hypothesis (Occam's Razor) suggests that
equation (4) might be replaced by
$$Y = \alpha + \beta(x_1 + x_2). \qquad (14)$$
For the estimation of equation (14), the computations are similar to, but shorter than, those
of § 4, since $(x_1 + x_2)$ may be replaced by a single variate, $x$, and a simple regression calculated;
the calculations in § 4 were used to give a first set of expected probits, from which was
derived the estimate
$$Y = -9.476 + 6.4067(x_1 + x_2). \qquad (15)$$

Only two parameters have been estimated from the data, and calculation as for equation 

= 28-76. 

The difference between the two $\chi^2$ values may be taken as a further criterion of whether or
not the extra parameter is needed, closely related to the test of significance of $(b_1 - b_2)$;

is not significant, though again the validity of the test is in doubt. 

Substitution of the probit of a specified probability in equation (15) gives the value of the
constant in equation (1). For the 50 % response probability, for example, the constant is
1.82; over the range of values tested, the curves
$$VR^{0.889} = 1.71 \quad\text{and}\quad VR = 1.82$$
differ only slightly. Similarly, fiducial limits to $(x_1 + x_2)$ may be calculated, for any $Y_0$, as
upper and lower values of the product $VR$. No special interest attaches to these calculations;
the novelties due to the individual records are exactly as for the three-parameter equation
discussed in earlier sections, and otherwise the method is entirely that of ordinary probit
analysis (Finney, 1947, Chapter 4). For comparison with the three-parameter equation,
diagrams similar to Figs. 2 and 3 may be prepared; both the constant probability curves and




the fiducial limits are then true hyperbolae. Fig. 4 shows the results for a 50 % response
probability, and is to be compared with Fig. 2. The constant probability curve in Fig. 4
differs little from that in Fig. 2, though naturally the difference increases for large values of
$V$ or $R$ where the curves are less well determined. For moderate values of $V$ and $R$, the fiducial



Fig. 4. Contour of dose-response surface for 0.5 frequency of response, estimated from two-parameter
equation, and its 5 % fiducial limits (compare Fig. 2). ○ no vaso-constriction; ● vaso-constriction.

limit curves are practically the same as the corresponding curves in Fig. 2, but for more
extreme values they lie much closer to the curve of constant probability; since the data
show no significant difference between $b_1$ and $b_2$, it is to be expected that a more precisely
estimated relationship between stimulus and response will be obtained if an assumption that
$\beta_1 = \beta_2$ is made, so that the information on the two regression coefficients can be combined,
and this shows itself by narrowing the zone of error for the constant probability curve.
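The arithmetic linking the fitted probit equations to the curves of constant response probability can be checked with a short sketch. It is only an illustration under stated assumptions: the dose metameters are taken to be $x_1 = 1 + \log_{10} V$ and $x_2 = 1 + \log_{10} R$ (a coding inferred here because it reproduces the quoted constants, not one stated in this section), the slopes 6.6843 and 5.9400 and the constant 14.182 are read from the equation quoted for Fig. 2, and the function names are hypothetical.

```python
import math

# Three-parameter fit, read from the Fig. 2 equation: 5 - Y = 14.182 - b1*x1 - b2*x2.
B1, B2 = 6.6843, 5.9400
A3 = 5.0 - 14.182              # implied intercept of Y = A3 + B1*x1 + B2*x2
# Two-parameter fit, equation (15): Y = A2 + B*(x1 + x2).
A2, B = -9.476, 6.4067


def contour_three_param(y0=5.0):
    """Return (c, g) so that the contour of probit y0 is V * R**g = c.

    Assumes x1 = 1 + log10(V) and x2 = 1 + log10(R) (an assumption of this sketch).
    """
    g = B2 / B1
    c = 10 ** ((y0 - A3 - B1 - B2) / B1)
    return c, g


def contour_two_param(y0=5.0):
    """Return c so that the contour of probit y0 is V * R = c (same coding assumed)."""
    return 10 ** ((y0 - A2) / B - 2.0)


def standard_error(variance):
    """Standard error from a variance, e.g. of the difference b1 - b2."""
    return math.sqrt(variance)


c3, g3 = contour_three_param()
c2 = contour_two_param()
print(round(c3, 2), round(g3, 3), round(c2, 2), round(standard_error(2.253), 3))
```

Under these assumptions the probit 5 (a 50 % response) gives the two curves compared in the text, and the standard error of $b_1 - b_2$ follows from the quoted variance.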

8. Summary 

The method of probit analysis has been developed to assist the study of the relationship 
between the magnitude of a stimulus and the proportion of tests in which a particular 
quantal response to that stimulus appears. In some research problems, the stimulus cannot 
be controlled sufficiently to make possible the administration of a specified magnitude, 
though the stimulus actually received by any one subject can later be measured. It will 
then seldom happen that two subjects receive exactly the same ‘dose’, and the data for 




statistical analysis will generally consist of a series of doses with, for each, a statement of 
whether or not a single subject showed the characteristic response. 

Even for data of this type, the probit transformation can aid the estimation of the relation- 
ship between dose and the probability of response. The calculations leading to the estimate 
are more tedious than is usual in probit analysis, because of slow convergence from a pro- 
visional equation to the final form, but follow the usual pattern. The validity of the test 
of goodness of fit (in reality a test for the normality of distribution of individual tolerances) 
must be doubted, however, since the disturbance due to small class numbers will be en- 
countered in its most extreme form. Extensive grouping of results for adjacent doses will 
provide a test less open to objection, though this will generally be insensitive to all but the
grossest deviations from normality; indeed, no valid sensitive test is to be expected with 
individual records unless these are very numerous. 

In this paper, the calculations have been illustrated on data relating to a reflex vaso- 
constriction which sometimes occurs in the skin of the digits of human subjects after a single 
deep breath. The relationship between the occurrence of this response and two dose factors, 
the volume and the rate of inspiration, has been estimated for the combined records from 
three subjects; inclusion of two dose factors complicates the analysis, since a bivariate
regression equation must be fitted, but does not affect the underlying theory. The $\chi^2$ test
has been discussed at length, though there is no indication of non-normality or of
heterogeneity of the data. The reliability with which the dependence of the probability of response
on the dose factors is estimated has also been examined, and curves bounding fiducial 
regions, within which the true probability contours may confidently be asserted to lie, have 
been determined. This method of representing the limits of error is applicable to other forms 
of probit analysis involving two dose factors and is not restricted to individual records, 
though it has not previously been described. 

I am indebted to Mr R. W. Gilliatt, of the Department of Physiology, both for permission
to make use of his data in an illustration of the statistical methods of my paper and for
assistance in describing his experimental procedure. My thanks are due also to Miss M. Callow,
who prepared Figs. 1-4.

REFERENCES 

Bliss, C. I. (1934a). The method of probits. Science, 79, 38-9.
Bliss, C. I. (1934b). The method of probits - a correction. Science, 79, 409-10.
Bliss, C. I. (1935a). The calculation of the dosage-mortality curve. Ann. Appl. Biol. 22, 134-67.
Bliss, C. I. (1935b). The comparison of dosage-mortality data. Ann. Appl. Biol. 22, 307-33.
Bliss, C. I. (1938). The determination of dosage-mortality curves from small numbers. Quart. J. Pharm. 11, 192-216.
Bolton, B., Carmichael, E. A. & Stürup, G. (1936). Vaso-constriction following deep inspiration. J. Physiol. 86, 83-94.
Finney, D. J. (1943). The statistical treatment of toxicological data relating to more than one dosage factor. Ann. Appl. Biol. 30, 71-9.
Finney, D. J. (1947). Probit Analysis: A Statistical Treatment of the Sigmoid Response Curve. Cambridge: University Press.
Finney, D. J. (1948). The principles of biological assay. J. R. Statist. Soc. Suppl. 9, 46-91.
Fisher, R. A. (1946). Statistical Methods for Research Workers, 10th ed. Edinburgh: Oliver and Boyd.
Fisher, R. A. & Yates, F. (1947). Statistical Tables for Biological, Agricultural and Medical Research, 3rd ed. Edinburgh: Oliver and Boyd.
Gaddum, J. H. (1933). Reports on biological standards. III. Methods of biological assay depending on a quantal response. Spec. Rep. Ser. Med. Res. Coun., Lond., no. 183.
Gilliatt, R. W. (1947). Vaso-constriction in the finger following deep inspiration. J. Physiol. (in the Press).





A POWER FUNCTION FOR TESTS OF RANDOMNESS
IN A SEQUENCE OF ALTERNATIVES


By F. N. DAVID 


1. During recent years attention has been focused on what might be called the 'group'
test for randomness in a sequence of alternatives. Thus, if $E$ denote the happening of an
event, and $\bar E$ its negation, the number of alternations of $E$ and $\bar E$ in a sequence supposedly
random has been chosen as a test criterion. This test has been put to different uses by
W. L. Stevens (1939), A. Wald & J. Wolfowitz (1940) and F. N. David (1947). It seems worth
while therefore to enquire what is the power of this test against a set of specifically defined
alternate hypotheses. The hypothesis to be tested will be that there is randomness within
the sequence, with the alternate hypothesis that if there is no randomness then there is
dependence of the type found in a simple Markoff chain. The same procedure will hold good
for dependence of the types found in more complex chains although in these cases the
enumeration is a little troublesome.


2. If there is a sequence of dependent events
$$E_1, E_2, E_3, \ldots, E_n,$$
then it is an elementary proposition of the probability calculus that
$$P\{E_1 E_2 E_3 \ldots E_{n-1} E_n\} = P\{E_1\}\,P\{E_2 \mid E_1\}\,P\{E_3 \mid E_1 E_2\} \ldots P\{E_n \mid E_1 E_2 \ldots E_{n-1}\}.$$
If the events are independent, then
$$P\{E_1 E_2 E_3 \ldots E_{n-1} E_n\} = P\{E_1\}\,P\{E_2\}\,P\{E_3\} \ldots P\{E_n\}.$$
This relation will be the basis of $H_0$, the hypothesis to be tested. If there is dependence as
in a simple Markoff chain, then mathematically each event will be dependent on the event
immediately preceding it, but will be independent of any of the other events. In this case
we shall have
$$P\{E_1 E_2 E_3 \ldots E_{n-1} E_n\} = P\{E_1\}\,P\{E_2 \mid E_1\}\,P\{E_3 \mid E_2\} \ldots P\{E_n \mid E_{n-1}\}.$$
This relation will be the basis of $H_1$, the hypothesis alternate to $H_0$.


3. For the hypothesis $H_0$, let the probability that an event $E$ will occur in a single trial
be $p$, and let the probability of $\bar E$ (the negation of $E$) be $q$, where $p + q = 1$. The probability
of obtaining any given sequence of $r_1$ $E$'s and $r_2$ $\bar E$'s will be
$$p^{r_1} q^{r_2}.$$
The number of ways in which $r_1$ $E$'s and $r_2$ $\bar E$'s may be arranged to form $2t$ and $2t + 1$ sets of
$E$'s and $\bar E$'s alternately is
$$f_{2t} = \frac{2\,(r_1 - 1)!\,(r_2 - 1)!}{(t - 1)!\,(t - 1)!\,(r_1 - t)!\,(r_2 - t)!} \quad\text{and}\quad f_{2t+1} = f_{2t} \times \frac{r_1 + r_2 - 2t}{2t}.$$

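Writing $k = 2t$ or $2t + 1$ as desired, the probability of obtaining a sequence of $r_1$ $E$'s and $r_2$ $\bar E$'s
arranged in $k$ sets is
$$P\{k \mid r_1 r_2 H_0\} = \frac{f_k\, p^{r_1} q^{r_2}}{\dfrac{(r_1 + r_2)!}{r_1!\, r_2!}\, p^{r_1} q^{r_2}} = \frac{f_k\, r_1!\, r_2!}{(r_1 + r_2)!}.$$
$k$ may take values $2, 3, \ldots, 2r_2$ if $r_1 = r_2$, and values $2, 3, \ldots, 2r_2 + 1$ if $r_1 > r_2$.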
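The counting formulae of this paragraph can be checked numerically. The following is a minimal sketch, assuming that the probability of $k$ sets is taken conditionally on $r_1$ and $r_2$, so that under randomness every arrangement of the $r_1$ $E$'s and $r_2$ $\bar E$'s is equally likely; the function names are illustrative and the brute-force enumeration is included only as a check.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def f_even(r1, r2, t):
    """Number of arrangements of r1 E's and r2 not-E's forming 2t alternating sets."""
    return 2 * comb(r1 - 1, t - 1) * comb(r2 - 1, t - 1)


def f_odd(r1, r2, t):
    """Number forming 2t + 1 sets, using f_{2t+1} = f_{2t} (r1 + r2 - 2t) / (2t)."""
    return Fraction(f_even(r1, r2, t) * (r1 + r2 - 2 * t), 2 * t)


def runs_pmf_h0(r1, r2):
    """Exact distribution of the number of sets k, given r1 >= r2 >= 1."""
    total = comb(r1 + r2, r1)
    pmf = {}
    for t in range(1, min(r1, r2) + 1):
        pmf[2 * t] = Fraction(f_even(r1, r2, t), total)
        odd = f_odd(r1, r2, t)
        if odd:  # zero when r1 = r2 = t, where 2t + 1 sets are impossible
            pmf[2 * t + 1] = odd / total
    return pmf


def runs_pmf_brute(r1, r2):
    """Same distribution by listing every arrangement and counting its sets."""
    n = r1 + r2
    counts = {}
    for ones in combinations(range(n), r1):
        seq = [1 if i in ones else 0 for i in range(n)]
        k = 1 + sum(a != b for a, b in zip(seq, seq[1:]))
        counts[k] = counts.get(k, 0) + 1
    total = comb(n, r1)
    return {k: Fraction(c, total) for k, c in counts.items()}


print(runs_pmf_h0(5, 3) == runs_pmf_brute(5, 3))
```

Exact rational arithmetic is used so that the formula and the enumeration can be compared for equality rather than to a tolerance.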




4. Following the orthodox procedure, in order to test the hypothesis, $H_0$, it is necessary
to find two numbers $k_1$ and $k_2$ such that
$$P\{k < k_1 \mid H_0\} \le \tfrac{1}{2}\epsilon, \qquad P\{k > k_2 \mid H_0\} \le \tfrac{1}{2}\epsilon,$$
and therefore
$$P\{k_1 \le k \le k_2 \mid H_0\} \ge 1 - \epsilon,$$
where $\epsilon$ is a number arbitrarily at choice. If an observed number of sets, say $k'$, falls outside
the limits $k_1$ and $k_2$ then the hypothesis $H_0$ will be rejected in favour of some alternate
hypothesis, $H_1$. Alternately if $H_0$ is not true, but $H_1$ is, then
$$1 - P\{k_1 \le k \le k_2 \mid H_1\}$$
will be the power of the test in the sense of the word as used by Neyman & Pearson. Whether
$k_1$ or $k_2$ is chosen to judge the significance of an observed $k'$ will depend on which departure
from randomness it is most important not to overlook. If the alternate hypothesis is that
there is positive dependence in the chain, i.e. that $E$ having occurred in the $s$th trial it is
more likely to occur in the $(s + 1)$st trial, then $k_1$ would be chosen. Such a situation was
envisaged in a proposed smooth test to supplement the $\chi^2$ criterion (David, 1947). If, however,
the alternate hypothesis is that there is negative dependence, i.e. that $E$ having occurred in
the $s$th trial, it is less likely to occur in the $(s + 1)$st trial, then $k_2$ would be the appropriate
criterion. If it is immaterial whether the departure from randomness is positive or negative
dependence, then both $k_1$ and $k_2$ may be used.

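5. We now consider the alternate hypothesis, $H_1$. Write $E_s$ for the occurrence of the
event $E$ in the $s$th trial and $\bar E_s$ for its negation. Let
$$P\{E_s\} = P, \quad P\{\bar E_s\} = Q, \quad P + Q = 1 \quad\text{and}\quad P \ge Q,$$
$$P\{E_s \mid E_{s-1}\} = p_1, \qquad P\{\bar E_s \mid \bar E_{s-1}\} = q_2,$$
$$P\{E_s \mid \bar E_{s-1}\} = p_2, \qquad P\{\bar E_s \mid E_{s-1}\} = q_1.$$
Thus $p_1$ and $q_2$ are probabilities of no change and $p_2$ and $q_1$ probabilities of a change. If the
events are independent then
$$p_1 = p_2 = P \quad\text{and}\quad q_1 = q_2 = Q.$$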


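6. In calculating the probability of obtaining any given sequence, what will matter will
be the number of changes from $E$ to $\bar E$ and back again. Let $f_t(r_1)$ be the number of ways in
which $r_1$ $E$'s can be arranged in $t$ groups, i.e. let
$$f_t(r_1) = \frac{(r_1 - 1)!}{(t - 1)!\,(r_1 - t)!}.$$
If there are $2t$ groups in a sequence of $r_1$ $E$'s and $r_2$ $\bar E$'s, the number of ways of obtaining such
a sequence will be
$$f_t(r_1)\, f_t(r_2)$$
if the sequence starts with $E$ or with $\bar E$. The probability of obtaining any given sequence of
$r_1$ $E$'s and $r_2$ $\bar E$'s of $2t$ groups will be
$$P\, p_1^{r_1 - t}\, q_1^{t}\, q_2^{r_2 - t}\, p_2^{t - 1} \quad\text{or}\quad Q\, q_2^{r_2 - t}\, p_2^{t}\, p_1^{r_1 - t}\, q_1^{t - 1}.$$
This follows from the fact that a sequence of $2t$ groups beginning with $E$ will imply $t$ changes
from $E$ to $\bar E$ and $t - 1$ changes from $\bar E$ to $E$. The changes are reversed in number if the sequence
starts with $\bar E$. For $2t + 1$ groups the number of ways of obtaining the sequence will be
$$f_{t+1}(r_1)\, f_t(r_2) \quad\text{or}\quad f_t(r_1)\, f_{t+1}(r_2)$$
according as the sequence begins with $E$ or $\bar E$. The respective probabilities will be
$$P\, p_1^{r_1 - t - 1}\, q_1^{t}\, q_2^{r_2 - t}\, p_2^{t} \quad\text{and}\quad Q\, q_2^{r_2 - t - 1}\, p_2^{t}\, p_1^{r_1 - t}\, q_1^{t}.$$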





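The probability of obtaining a sequence of $r_1$ $E$'s and $r_2$ $\bar E$'s in $2t$ groups will be
therefore, under hypothesis $H_1$,
$$P\{2t \mid r_1 r_2 H_1\} = \frac{\left(\dfrac{p_2 q_1}{p_1 q_2}\right)^{t} f_t(r_1)\, f_t(r_2) \left[\dfrac{P}{p_2} + \dfrac{Q}{q_1}\right]}{\displaystyle\sum_{s=1}^{r_2} \left(\dfrac{p_2 q_1}{p_1 q_2}\right)^{s} \left[f_s(r_1)\, f_s(r_2) \left(\dfrac{P}{p_2} + \dfrac{Q}{q_1}\right) + \dfrac{P}{p_1}\, f_{s+1}(r_1)\, f_s(r_2) + \dfrac{Q}{q_2}\, f_s(r_1)\, f_{s+1}(r_2)\right]}.$$
The probability of obtaining $r_1$ $E$'s and $r_2$ $\bar E$'s in $2t + 1$ groups will be similarly
$$P\{2t + 1 \mid r_1 r_2 H_1\} = \frac{\left(\dfrac{p_2 q_1}{p_1 q_2}\right)^{t} \left[\dfrac{P}{p_1}\, f_{t+1}(r_1)\, f_t(r_2) + \dfrac{Q}{q_2}\, f_t(r_1)\, f_{t+1}(r_2)\right]}{\displaystyle\sum_{s=1}^{r_2} \left(\dfrac{p_2 q_1}{p_1 q_2}\right)^{s} \left[f_s(r_1)\, f_s(r_2) \left(\dfrac{P}{p_2} + \dfrac{Q}{q_1}\right) + \dfrac{P}{p_1}\, f_{s+1}(r_1)\, f_s(r_2) + \dfrac{Q}{q_2}\, f_s(r_1)\, f_{s+1}(r_2)\right]}.$$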
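As a check on these two formulae, the sketch below evaluates them and compares the result with a direct enumeration of every sequence under a simple Markoff chain. It is an illustration only: the function names are hypothetical, $f_t(r)$ is taken as zero when $t > r$, and the numerical values of $P$, $p_1$ and $p_2$ in the example are arbitrary choices.

```python
from itertools import product
from math import comb, isclose


def f(t, r):
    """Ways of arranging r like events in t groups; zero when t > r."""
    return comb(r - 1, t - 1) if 1 <= t <= r else 0


def runs_pmf_h1(r1, r2, P, p1, p2):
    """P{k | r1, r2, H1} from the closed formulae (requires r1 >= r2 >= 1)."""
    Q, q1, q2 = 1 - P, 1 - p1, 1 - p2
    ratio = (p2 * q1) / (p1 * q2)
    weights = {}
    for t in range(1, min(r1, r2) + 1):
        weights[2 * t] = ratio**t * f(t, r1) * f(t, r2) * (P / p2 + Q / q1)
        odd = ratio**t * (P / p1 * f(t + 1, r1) * f(t, r2)
                          + Q / q2 * f(t, r1) * f(t + 1, r2))
        if odd > 0:
            weights[2 * t + 1] = odd
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def runs_pmf_h1_brute(r1, r2, P, p1, p2):
    """Same distribution by enumerating all sequences with r1 E's (coded 1)."""
    Q, q1, q2 = 1 - P, 1 - p1, 1 - p2
    step = {(1, 1): p1, (1, 0): q1, (0, 1): p2, (0, 0): q2}
    weights = {}
    for seq in product((0, 1), repeat=r1 + r2):
        if sum(seq) != r1:
            continue
        prob, k = (P if seq[0] else Q), 1
        for a, b in zip(seq, seq[1:]):
            prob *= step[(a, b)]
            k += a != b          # a change of symbol starts a new group
        weights[k] = weights.get(k, 0.0) + prob
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


formula = runs_pmf_h1(4, 3, 0.6, 0.7, 0.45)
direct = runs_pmf_h1_brute(4, 3, 0.6, 0.7, 0.45)
print(all(isclose(formula[k], direct[k]) for k in direct))
```

Setting $p_1 = p_2 = P$ makes the ratio $p_2 q_1 / p_1 q_2$ equal to unity and recovers the distribution under randomness.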


7. So far no mention has been made of any possible connexion between $p_1$, $p_2$, $q_1$ and $q_2$.
It is obvious in all cases we shall have
$$p_1 + q_1 = 1, \qquad p_2 + q_2 = 1,$$
but the connexion between $p_1$ and $p_2$ is not immediate. We shall make the simplifying
assumption, which is perhaps most closely related to practical problems, and shall state that where
nothing is known about the $s - 1$ trials preceding the $s$th trial, $P\{E_s\} = P$ and $P\{\bar E_s\} = Q$.
Under this assumption we have
$$p_2 = \frac{P}{Q}\,(1 - p_1) = \frac{P q_1}{Q}.$$
This result is reached easily by noticing that
$$P\{E_s\} = P\{E_s E_{s-1}\} + P\{E_s \bar E_{s-1}\} = P\{E_{s-1}\}\,P\{E_s \mid E_{s-1}\} + P\{\bar E_{s-1}\}\,P\{E_s \mid \bar E_{s-1}\},$$
whence $P = P p_1 + Q p_2$.


8. The alternative hypothesis chosen to illustrate the power function formulae is that
there is positive dependence in the sequence, i.e. $k_1$ is found so that
$$P\{k < k_1 \mid H_0\} \le \epsilon \quad\text{and}\quad 1 - P\{k \ge k_1 \mid H_1\}$$
is calculated, when $p_1 > P$. For economy of drawing, several power curves, or what are really
sections of a kind of power surface, plotted to coordinates $P$, $p_1$, have been put together in
the diagrams of Fig. 1. For example the bottom left-hand diagram shows for $r_1 = r_2 = 10$
sections of the conditional power surface for $P = 0.5$, $0.6$ and $0.75$. When $H_0$ is true and
$p_1 = P$ we have the 5 % risk of rejecting $H_0$ wrongly. As $p_1 - P$ increases the chance of
detecting the fact increases, but in a way dependent on $P$. The other three diagrams show
similar sections of the surfaces with $r_1 = r_2 = 5$, with $r_1 = 14$, $r_2 = 6$ and with $r_1 = 7$, $r_2 = 3$.
In practice it will not be known what the value of $P$ is, but the curves show reasonably well
how the power of the test varies as $P$ and $p_1$ (and therefore $p_2$) vary. It is clear that the test
for randomness under discussion is most powerful when the numbers of alternates are equal,
i.e. when $r_1 = r_2$. The power declines sharply when $r_1$ increases at the expense of $r_2$.

Another point which emerges is that the test is only moderately powerful, against the given
alternate hypothesis tested, when $r_1 + r_2 = 20$, and it would appear therefore that if it
was desired not to overlook a possible departure from randomness in the form of positive
dependence in the chain, then the length of the sequence should consist of at least 20 units.
The question of other possible tests we shall not discuss at this stage.
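The calculation behind a single point of such a conditional power curve can be sketched as follows. This is an illustration rather than the computation actually used for Fig. 1: the function names are hypothetical, $k_1$ is taken as the largest value for which $P\{k < k_1 \mid H_0\} \le \epsilon$, and $p_2$ is obtained from the relation $P = P p_1 + Q p_2$ of § 7.

```python
from math import comb


def f(t, r):
    """Ways of arranging r like events in t groups; zero when t > r."""
    return comb(r - 1, t - 1) if 1 <= t <= r else 0


def runs_pmf(r1, r2, P, p1):
    """Conditional distribution of k given r1 >= r2 >= 1; p1 = P gives H0."""
    Q, q1 = 1 - P, 1 - p1
    p2 = P * q1 / Q              # from P = P*p1 + Q*p2
    if not 0 < p2 < 1:
        raise ValueError("P and p1 do not give a valid p2")
    q2 = 1 - p2
    ratio = (p2 * q1) / (p1 * q2)
    w = {}
    for t in range(1, min(r1, r2) + 1):
        w[2 * t] = ratio**t * f(t, r1) * f(t, r2) * (P / p2 + Q / q1)
        odd = ratio**t * (P / p1 * f(t + 1, r1) * f(t, r2)
                          + Q / q2 * f(t, r1) * f(t + 1, r2))
        if odd > 0:
            w[2 * t + 1] = odd
    total = sum(w.values())
    return {k: v / total for k, v in w.items()}


def lower_limit(r1, r2, P, eps=0.05):
    """Largest k1 with P{k < k1 | H0} <= eps."""
    pmf, cum = runs_pmf(r1, r2, P, P), 0.0
    for k in sorted(pmf):
        if cum + pmf[k] > eps:
            return k
        cum += pmf[k]
    return max(pmf) + 1


def conditional_power(r1, r2, P, p1, eps=0.05):
    """P{k < k1 | r1, r2, H1}: chance of detecting positive dependence."""
    k1 = lower_limit(r1, r2, P, eps)
    return sum(v for k, v in runs_pmf(r1, r2, P, p1).items() if k < k1)


for p1 in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(p1, round(conditional_power(10, 10, 0.5, p1), 3))
```

Because the number of sets is discrete, the value at $p_1 = P$ is the attained size of the test, which cannot exceed the nominal $\epsilon$.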



Fig. 1. Conditional power curves when the alternate hypothesis is positive dependence. Four panels: sequence of 20, $r_1 = 14$, $r_2 = 6$; sequence of 10, $r_1 = 7$, $r_2 = 3$; sequence of 20, $r_1 = r_2 = 10$; sequence of 10, $r_1 = r_2 = 5$. Ordinate: power of test; abscissa: scale for $P$ and $p_1$.





9. It will be noticed that $P\{2t \text{ or } 2t + 1 \mid r_1 r_2 H_1\}$, which have been loosely termed power
function formulae, are not power functions in the sense originally defined by Neyman &
Pearson, but they appear to involve a justifiable extension of that idea. In order to
distinguish them from the usual meaning of the words power function, I shall refer to them as
conditional power functions. The theory of the conditional power function may be stated
briefly in the following way. It is assumed that all possible samples (or sequences) may be
classified according to their composition. Suppose that there are $k$ of these mutually
exclusive classes, which are also the only possible, say $C_1, C_2, \ldots, C_k$. We have considered only the
case where $k$ is finite but it appears likely that the method can be extended to cover the case
where $k$ is enumerably infinite. These classes, $C_1, C_2, \ldots$, correspond to regions forming
a partition of the sample space.

Let $H_0$ be the hypothesis tested and $w_0$ be the critical region used for the rejection of this
hypothesis. Given that a sample is in $C_i$ (say), and that an alternate hypothesis $H_1$ is true,
then the probability that $H_0$ will be rejected is
$$P\{E \in w_0 C_i \mid E \in C_i, H_1\} = \frac{P\{E \in w_0 C_i \mid H_1\}}{P\{E \in C_i \mid H_1\}},$$
where $w_0 C_i$ means the region common to $w_0$ and to $C_i$ and, following the Neyman-Pearson
notation, $E$ is the sample point. Regarded as a function of $H_1$ this is the conditional power
function of the test associated with $w_0$ in the subset $C_i$ of samples.

The Neyman-Pearson power function, which we might call here the overall power function,
will be
$$P\{E \in w_0 \mid H_1\} = \sum_{i=1}^{k} P\{E \in w_0 C_i \mid H_1\} = \sum_{i=1}^{k} P\{E \in w_0 C_i \mid E \in C_i, H_1\}\, P\{E \in C_i \mid H_1\},$$
which may be looked on as a weighted average of the conditional power functions.


10. There seems to be no reason why $w_0$ should not be built up of portions $w_0 C_i$, these
portions being chosen to maximize each term of the summation, i.e. $w_0 C_i$ chosen to maximize
the conditional power function. For example, to revert to the specific case of randomness
within a sequence with which we have been dealing, the different partitions of $r$ ($= r_1 + r_2$) are
the mutually exclusive and only possible classes $C_i$. It is conceivable, although practically
not very likely, that for each of these classes there will exist a different test which is more
powerful to detect specifically defined departures from the basic hypothesis tested than
any other test. The decision as to which is the most powerful test, against the same specifically
defined alternatives, to use for any given class will be decided by the conditional power
function. Once this has been decided the procedure for the complete test of significance may
be laid down. This will be: (i) count the number of alternatives in the sequence, i.e. find $r_1$
and $r_2$, (ii) from (i) decide the appropriate test of significance to use, (iii) apply the test. The
power of the test as laid down by (i), (ii) and (iii), in the usual meaning of the word, will be
given by the overall power function.

It is proposed to discuss these, and other applications of the conditional power function 
technique, in a further publication. I have been concerned here with trying to explain what 
I believe to be the basic ideas, and to forestall possible criticism that I am falling into error 
(of the third kind) and am choosing the test falsely to suit the significance of the sample. 


REFERENCES

Stevens, W. L. (1939). Ann. Eugen., Lond., 9, 11.
Wald, A. & Wolfowitz, J. (1940). Ann. Math. Statist. 11, 147.
David, F. N. (1947). Biometrika, 34, 299.





A NUMERICAL SOLUTION OF THE PROBLEM OF MOMENTS
By H. O. HARTLEY and S. H. KHAMIS
1. Introduction

Given a statistical variable $x$ and its frequency distribution $f(x)$, then, under certain
continuity conditions for $f(x)$, the moments
$$\mu_r = \int x^r f(x)\,dx \quad (r = 0, 1, 2, \ldots) \qquad (1)$$
can be evaluated for any integer $r$. For certain distributions $f(x)$ the integrations in (1) can
be carried out analytically resulting in simple formulae for the moments. In general there
is no inherent difficulty in obtaining numerical values for the moments by numerical
quadrature.

The inverse problem is to find the distribution $f(x)$ given the moments $\mu_r$. This problem,
commonly known as 'The Problem of Moments', has received considerable attention by
mathematicians and is of interest in statistical distribution theory. There are numerous
statistics for which it is difficult to obtain a formula of the random sampling distribution
$f(x)$ amenable to numerical evaluation. On the other hand, in such cases it is often possible
to find simple formulae for the random sampling moments (Bartlett, 1937). Sometimes such
formulae are available for all integer $r$; more often than not, however, $\mu_r$ is only known for
a limited number of small $r$ (e.g. $r = 0, 1, \ldots$). A simple method of 'determining' $f(x)$ from
the given moments would therefore be helpful in such cases.

Examples of variables of this kind are the numerous moment statistics or $k$-statistics for
which random sampling moments can be evaluated, notably by R. A. Fisher's (1929, 1930)
combinatorial methods, whilst their exact sampling distributions are usually unknown.
As related statistics we should mention here the moment ratios $\sqrt{b_1}$ and $b_2$ used in tests
for deviation from normality (Geary, 1947; Geary & Worlledge, 1947). For these, the
low-order moments are known exactly. A similar situation arises with statistics defined as
likelihood ratios, as, for instance, with the criterion $L_1$ required for testing heterogeneity
in a set of variances. Moments for this statistic were obtained by Neyman & Pearson as early
as 1931, yet, although approximations to $f(L_1)$ have been obtained (Bartlett, 1937; Hartley,
1940; Nayer, 1936; Neyman & Pearson, 1931; Sukhatme, 1936; Welch, 1935, 1936), there
is still considerable doubt about their accuracy in certain cases, and the exact formula
obtained by Nair (1936) in the case of equal sample sizes is very complex.

These and numerous other problems of distribution point to the necessity of developing 
a numerical technique to deal with the following situation: 

(i) A random variable $x$ ranging between $a$ and $b$ (where $a$ may be $-\infty$ and $b$ may be $+\infty$)
has a distribution function $f(x)$ known to have a continuous derivative of order $n$.

(ii) The moments
$$\mu_r = \int_a^b x^r f(x)\,dx \quad (r = 0, 1, \ldots, R), \qquad (2)$$
are known numerically to any decimal accuracy desired but for a limited number of positive
integers $r$, viz. $r = 0, 1, \ldots, R$. With the knowledge about $f(x)$ limited to the above conditions,
is it possible to obtain numerical values for the probability integral $P(x) = \int_a^x f(x)\,dx$



depending on the moments only, and is it possible to make a statement on the accuracy of
these values in terms of the derivatives of the function $f(x)$?

Problems of this kind have hitherto been treated principally in two ways: 

(a) When $R = 2$, 3 or 4 nothing better can be expected than a 'good fit', which is often
achieved by fitting the appropriate Pearson-type curve.

(b) With $R$ in the neighbourhood of 6-8, expansions of the Gram-Charlier, Laguerre or
Jacobi type have been used, either as cumulant or as moment expansions. Such theorems
as are available for statements on the convergence and asymptotic behaviour of these
expansions usually require too many moments to be known. Often the expansions are only
asymptotic, and unless the distribution is close to the generating curve (Normal for
Gram-Charlier, $\Gamma$ for Laguerre), the results are often disappointing (see, for example, Kendall,
1945, Chapter 6).


2. Outline of present method


The method to be developed here is a direct application of finite-difference calculus and
therefore provides both numerical answers to the problem, as well as gauges of their accuracy
in form of remainder terms. The method is, in fact, closely linked with interpolation technique.
When using any of the well-known interpolation formulae no mathematically rigorous
statement on the accuracy of the interpolates can be made unless the magnitude of the
remainder term can be estimated, and for this some knowledge about (say) the $n$th derivative
of the function is required. Yet, in using such formulae the convergence of the difference
table inspires confidence that 'the results of the interpolation can be accepted as a working
hypothesis' (Milne Thomson, 1933, p. 62). Similarly, with the present method we shall give
a numerical procedure of obtaining values of the probability integral. Certain checks of
internal consistency will be described which inspire confidence that the answers are correct,
but no rigorous statement on the accuracy can, of course, be made if this is to be based on
a finite number of moments alone. The exact remainder terms which we derive will entail
the high-order derivatives of $f(x)$, and it is hoped, in a second communication, to derive
some general statements concerning their order of magnitude.

In order to simplify the argument we assume in this section that the range of $x$ is finite
($a$ and $b$ finite).

The aim is to determine the probability integral of $x$, $P(x) = \int_a^x f(x)\,dx$, in tabular form,
i.e. we wish to determine numerical values of
$$P_i = P(x_i) = \int_a^{x_i} f(x)\,dx \qquad (3)$$
for discrete values of $x_i$. For convenience the group intervals $x_i - x_{i-1}$ will generally be chosen
equidistant (group interval $= h$), and the number of intervals will be $R + 1$, i.e. equal to the
number of given moments (including $\mu_0 = 1$). Hence
$$x_i = a + ih, \qquad h = (b - a)/(R + 1). \qquad (4)$$
The first differences in the table derived from equation (3) are the quantities
$$f_i = P_i - P_{i-1} = \int_{x_{i-1}}^{x_i} f(x)\,dx \qquad (5)$$
and are the familiar 'frequencies' in a grouped frequency distribution with equidistant
intervals (see Fig. 1). The link between these frequencies and the exact moments $\mu_r$ is then




established by the well-known formulae for Sheppard's correction. Using Kendall's (1938,
1945) derivation and remainder term, but extending his notation, we have
$$\sum_{i=1}^{R+1} f_i\, \xi_i^r = \mu_r + G(r, h) + S(r, h), \qquad (6)$$

where the centre points $\xi_i$ are given by
$$\xi_i = a + (i - \tfrac{1}{2})h \quad (i = 1, \ldots, R + 1). \qquad (7)$$
$G(r, h)$ denotes Sheppard's corrective term, viz.

(8)


The aim, now, is to use equations (6) to determine the unknown $f_i$ from the given $\mu_r$.
To this end the remainder $S(r, h)$ must be examined. Most distribution functions have what
is commonly known as high contact at the terminals of the variate range. This means
that $f(x)$, as well as all its derivatives up to order, say, $m$, vanish at both ends of the range, i.e.
$$f^{(j)}(a) = f^{(j)}(b) = 0 \quad (j = 0, 1, \ldots, m). \qquad (9)$$
If for such functions we define $f(x) = 0$ outside the range $a \le x \le b$, it will have continuous
derivatives of up to order $m$ for $-\infty < x < +\infty$. It can then be shown that the remainder
term is of the form (see, for example, Kendall, 1945, p. 69)

h) = - h, 6fi {in even), (10) 

SAr, h) = 5«Vi(J)i”(-)(r, K dr) {m odd), (11) 

a^df^b, 

where the $B_j$ are the Bernoulli numbers, the $B_j^{(1)}$ are the Bernoulli polynomials of first order,
the integrand function $k(r, h, x)$ is defined by

]c{r, h, x) = f{x + di, '( 12 ) 

J ~Vi 

and its derivatives with regard to $x$ are denoted by $k^{(j)}(r, h, x)$. In the subsequent sections we shall
assume (9) to hold (contact of order $m$), but will discuss the case when (9) is not satisfied
in § 10.

The remainder term $S_m(r, h)$ will usually be small (see, for example, Kendall, 1945, p. 72).
We shall therefore, in what follows, ignore $S_m(r, h)$ but will discuss the error thereby
committed in § 5.

If, then, in (6) we omit $S(r, h)$ we obtain a system of $R + 1$ linear equations for the $R + 1$
unknowns $f_1, \ldots, f_{R+1}$,
$$\sum_{i=1}^{R+1} f_i\, \xi_i^r = \mu_r + G(r, h) = \bar\mu_r \quad (r = 0, 1, \ldots, R). \qquad (13)$$




The matrix of this system of equations ($V_R$ say) is of the form $|\xi_i^r|$ and has a classical
determinant $\|\xi_i^r\|$, sometimes referred to as Vandermonde's determinant and well known to be
$\ne 0$. The system can therefore be inverted once and for all and, for any particular case, the
unknown $f_i$ can then be determined by substituting the right-hand sides of (13), i.e. $\bar\mu_r$, in
the inverse matrix $V_R^{-1}$. Denoting the elements of this inverse matrix by $C_{ir}$ we have the
system of equations
$$f_i = \sum_{r=0}^{R} C_{ir}\, \bar\mu_r. \qquad (14)$$
Progressive addition of the $f_i$ yields the $P_i$ from $P_i = \sum_{j=1}^{i} f_j$ and therefore a table of $P(x)$ at
interval $h$. Finally, intermediate values of $P(x)$ can be obtained by standard interpolation.
Alternatively, as described in § 7, we may obtain directly a table of $P(x)$ at interval $\tfrac{1}{2}h$.
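The linear-algebra step of the method, solving the Vandermonde system for the frequencies and accumulating them, can be sketched as below. This is an illustration under stated assumptions: the corrected moments $\bar\mu_r$ are taken as already available (Sheppard's term is not computed here), exact rational arithmetic replaces the desk calculation, and the function names are hypothetical.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]


def frequencies_from_moments(a, b, corrected_moments):
    """Solve sum_i f_i xi_i^r = mu_bar_r (r = 0..R) and accumulate the P_i.

    a, b are the (integer or rational) terminals of the range.
    """
    R = len(corrected_moments) - 1
    h = Fraction(b - a, R + 1)
    centres = [a + (i - Fraction(1, 2)) * h for i in range(1, R + 2)]
    V = [[xi**r for xi in centres] for r in range(R + 1)]   # row r holds xi_i^r
    f = solve_linear(V, corrected_moments)
    P, running = [], Fraction(0)
    for fi in f:
        running += fi
        P.append(running)
    return centres, f, P


# Round trip on an invented grouped distribution over (0, 5) with R = 4.
true_f = [Fraction(n, 10) for n in (1, 2, 4, 2, 1)]
xs = [Fraction(2 * i - 1, 2) for i in range(1, 6)]
mu_bar = [sum(fi * x**r for fi, x in zip(true_f, xs)) for r in range(5)]
centres, f, P = frequencies_from_moments(0, 5, mu_bar)
print(f == true_f, P[-1])
```

The round trip only confirms that the system is uniquely solvable; it says nothing about the size of the neglected remainder term.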


3. The standard form of the numerical inversion

The rank of the original matrix $V_R$ is obviously equal to $R + 1$, i.e. the number of moments
given, whilst its elements are the powers of the centre points $\xi_i$. It is desirable therefore that,
for any given $R$, scale and location of the variable $x$ be transformed into a standard form $X$,
so that only one matrix $V_R$ and therefore only one matrix $V_R^{-1}$ need be calculated for each $R$.
It is most convenient to standardize as follows:
$$X = \left(x - \tfrac{1}{2}(a + b)\right)\frac{R + 1}{b - a} \quad (R \text{ even}), \qquad (15)$$
$$X = \left(x - \tfrac{1}{2}(a + b)\right)\frac{R + 1}{b - a} + \tfrac{1}{2} \quad (R \text{ odd}). \qquad (16)$$
It will be seen, therefore, that the range of $X$ is $R + 1$ and the group interval
$$H = X_{i+1} - X_i = 1.$$
From the given moments of $x$ those of $X$ ($M_r$ say) about $X = 0$ can, of course, be calculated
by the usual binomial formulae, and in what follows we assume that values of $M_r$ are given
numerically. Further, in analogy to (13), we have
$$\bar M_r = M_r + G(r, 1). \qquad (17)$$
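The passage from the exact moments $M_r$ to the corrected moments $\bar M_r$ can be sketched numerically. The form of Sheppard's term used below, $G(r, h) = \sum_{j \ge 1} \binom{r}{2j} \frac{(h/2)^{2j}}{2j + 1}\, \mu_{r - 2j}$, is the usual relation between grouped and exact raw moments and is an assumption of this sketch, since the authors' own expression is not legible here; it does appear consistent with the matrix tabulated for $R = 6$ at the end of this section. The function names are illustrative.

```python
from fractions import Fraction
from math import comb


def sheppard_term(r, h, moments):
    """Assumed form of G(r, h): grouped raw moment minus exact raw moment."""
    h = Fraction(h)
    return sum(
        comb(r, 2 * j) * (h / 2) ** (2 * j) / (2 * j + 1) * moments[r - 2 * j]
        for j in range(1, r // 2 + 1)
    )


def corrected_moments(moments, h=1):
    """Return the list of M_bar_r = M_r + G(r, h) for r = 0, 1, ..., R."""
    return [m + sheppard_term(r, h, moments) for r, m in enumerate(moments)]


# With unit group interval the second moment is raised by h^2 / 12.
print(corrected_moments([1, 0, 3])[2])
```

For $r = 0$ and $r = 1$ the term vanishes, so the total frequency and the mean are unaffected.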

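From (15) and (16) we obtain for the new centre points
$$\Xi_i = -\tfrac{1}{2}R, \ldots, 0, \ldots, \tfrac{1}{2}R \quad \text{for even } R,$$
$$\Xi_i = -\frac{R - 1}{2}, \ldots, 0, \ldots, \frac{R + 1}{2} \quad \text{for odd } R, \qquad (18)$$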


and the matrix $V_R$ becomes $|\,(i - 1 - \tfrac12 R)^r\,|$ or $|\,(i - \tfrac12 R - \tfrac12)^r\,|$. Thus, if the first six moments are given, we obtain for $V_6$

$$V_6 = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 \\
-3 & -2 & -1 & 0 & 1 & 2 & 3 \\
9 & 4 & 1 & 0 & 1 & 4 & 9 \\
-27 & -8 & -1 & 0 & 1 & 8 & 27 \\
81 & 16 & 1 & 0 & 1 & 16 & 81 \\
-243 & -32 & -1 & 0 & 1 & 32 & 243 \\
729 & 64 & 1 & 0 & 1 & 64 & 729
\end{pmatrix} \tag{19}$$

* It is, of course, possible to construct a matrix yielding the $P_j$ directly from the $\bar M_r$, but we are here satisfied with determining the $f_i$ first, as they are of independent interest.



A numerical solution of the 'problem of moments'

In practice the important range of $R$ will be from 5 to about 8. The inverse matrix $V_6^{-1}$ is given below, and it is hoped to give the inverse matrices for other values of $R$ in a subsequent paper. The inverse matrix $V_6^{-1}$, the elements of which are denoted by $C_{ir}$, can be written in the form

$$C_i f_i = \sum_{r=0}^{6} U_{ir}\,\bar M_r, \tag{20}$$

where $U_{ir} = C_i C_{ir}$, i.e. the $C_i$ are suitable common denominators of the $C_{i0}, \ldots, C_{i6}$, and the $U_{ir}$ are given in the body of the schedule below:

                    Multiplier of column
                    M̄0 = 1   M̄1    M̄2    M̄3    M̄4    M̄5    M̄6
  i    C_i f_i      r = 0      1      2      3      4      5      6
  1    720 f_1        0      -12      4     15     -5     -3      1
  2    120 f_2        0       18     -9    -20     10      2     -1
  3     48 f_3        0      -36     36     13    -13     -1      1
  4     36 f_4       36        0    -49      0     14      0     -1
  5     48 f_5        0       36     36    -13    -13      1      1
  6    120 f_6        0      -18     -9     20     10     -2     -1
  7    720 f_7        0       12      4    -15     -5      3      1
                                                                 (21)

In order to use the above system of equations it would be necessary to compute the $\bar M_r$ from the given $M_r$ using formula (17). It is obviously more convenient to evaluate, once and for all, a matrix $U'_{ir}$ giving the $f_i$ directly in terms of the given $M_r$. This matrix is given below for $R = 6$:

  i   f_i    M0 = 1       M1          M2          M3          M4          M5          M6
  1   f_1    0.000379   -0.011719    0.002344    0.017361   -0.005208   -0.004167    0.001389
  2   f_2   -0.005227    0.109375   -0.034896   -0.152778    0.072917    0.016667   -0.008333
  3   f_3    0.059161   -0.683594    0.618490    0.253472   -0.244792   -0.020833    0.020833
  4   f_4    0.891373    0          -1.171875    0           0.354167    0          -0.027778
  5   f_5    0.059161    0.683594    0.618490   -0.253472   -0.244792    0.020833    0.020833
  6   f_6   -0.005227   -0.109375   -0.034896    0.152778    0.072917   -0.016667   -0.008333
  7   f_7    0.000379    0.011719    0.002344   -0.017361   -0.005208    0.004167    0.001389
                                                                                        (22)

Working rule: Each $f_i$ is obtained by forming the sum of seven products using the seven coefficients in the $i$th line and applying them to $M_0, \ldots, M_6$; e.g. $f_1 = 0.000379\,M_0 - 0.011719\,M_1 + \ldots + 0.001389\,M_6$.
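The construction of the matrix (22) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the inverse of $V_6$ is obtained from Lagrange basis polynomials (whose coefficients are the rows of an inverse Vandermonde matrix), and $G(r, 1)$ is taken to be the usual grouping correction for a unit interval, i.e. the grouped moment of order $r$ is $\sum_k \binom{r}{2k} M_{r-2k} / \{(2k+1)4^k\}$. With that reading the sketch agrees with the printed coefficients to six decimals.

```python
from fractions import Fraction
from math import comb


def lagrange_coeffs(nodes, i):
    """Ascending-power coefficients of the Lagrange basis polynomial for nodes[i].

    These are the elements C_ir of the inverse Vandermonde matrix.
    """
    poly, denom = [Fraction(1)], Fraction(1)
    for j, xj in enumerate(nodes):
        if j == i:
            continue
        # Multiply the running polynomial by (x - xj).
        poly = [a - xj * b for a, b in zip([Fraction(0)] + poly, poly + [Fraction(0)])]
        denom *= nodes[i] - xj
    return [c / denom for c in poly]


def grouping_weight(r, s):
    """Coefficient of M_s in the grouped moment of order r (assumed form of G, h = 1)."""
    k = r - s
    if k < 0 or k % 2:
        return Fraction(0)
    return Fraction(comb(r, k), (k + 1) * 2 ** k)


def direct_matrix(R):
    """Rows give f_1..f_{R+1} as linear functions of M_0..M_R (even R only)."""
    nodes = [Fraction(2 * i - R, 2) for i in range(R + 1)]  # -R/2, ..., R/2
    rows = []
    for i in range(R + 1):
        c = lagrange_coeffs(nodes, i)
        rows.append([sum(c[r] * grouping_weight(r, s) for r in range(R + 1))
                     for s in range(R + 1)])
    return rows


if __name__ == "__main__":
    for row in direct_matrix(6):
        print("  ".join(f"{float(u):10.6f}" for u in row))
```

Exact rational arithmetic is used so that the printed six-decimal coefficients can be checked without rounding noise; the first column sums to unity, as it must if the $f_i$ are to add to 1.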


4. Calculation of the incomplete β-function $I_x(8, 6)$ from its first six moments

As an example for the above method we consider the Beta Distribution for $p = 8$ and $q = 6$,

$$f(x) = [B(8, 6)]^{-1} x^7 (1 - x)^5.$$

Using the moments for this distribution about $x = 0$, $\mu_r = B(8 + r, 6)/B(8, 6)$ $(r = 0, \ldots, 6)$, and transforming to the standard scale $X = 7x - 3.5$, we obtain for the moments of $X$ about $X = 0$: $M_1 = 0.5$, $M_2 = 1.05$, $M_3 = 1.225$, $M_4 = 2.77426$, $M_5 = 4.41360$ and $M_6 = 10.56942$.

Substituting these in the matrix (22) we obtain values of $f_i$ whose progressive sums are shown in Table 1 (calculated $I_x(8, 6)$). These may be compared with the 'exact' values obtained (by interpolation) from the Tables of the Incomplete Beta-Function (1934). The worst discrepancy is about 2 in the fourth decimal. Higher accuracy can, of course, be obtained if the number of moments $(R + 1)$ and therefore the number of $f_i$ increases (see, for example, § 8, where the normal curve is obtained to 5-decimal accuracy).




A rather gratifying feature of the comparison is the higher decimal accuracy in the tails of the distribution. This is a consequence of the sensitivity of the higher moments to changes in the tail frequencies. Note also that the elements in the top and bottom lines of the inverse matrix (22) are much smaller than those in the other lines, so that any error in the right-hand sides of (13) has a smaller effect on the terminal $f_i$.


Table 1. Comparison of 'calculated' and 'exact' values of $I_x(8, 6)$

    X       x      Exact I     Calculated I    Difference × 10⁻⁵
  -2.5     1/7     0.00011       0.00009              2
  -1.5     2/7     0.01341       0.01354            -13
  -0.5     3/7     0.14017       0.13995             22
   0.5     4/7     0.48963       0.48981            -18
   1.5     5/7     0.86261       0.86270             -9
   2.5     6/7     0.99411       0.99396             15
   3.5     7/7     1.00000       1.00000              0
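The 'calculated' column of Table 1 can be re-derived with a short Python sketch. It is a minimal illustration, not the authors' computing schedule: the helper names are hypothetical, exact fractions replace the six-decimal matrix (22), and the grouping correction $G(r, 1)$ is assumed to be $\sum_k \binom{r}{2k} M_{r-2k}/\{(2k+1)4^k\}$ less $M_r$.

```python
from fractions import Fraction
from itertools import accumulate
from math import comb


def basis(nodes, i):
    """Coefficients C_ir: Lagrange basis polynomial for nodes[i], ascending powers."""
    poly, denom = [Fraction(1)], Fraction(1)
    for j, xj in enumerate(nodes):
        if j != i:
            poly = [a - xj * b for a, b in zip([Fraction(0)] + poly, poly + [Fraction(0)])]
            denom *= nodes[i] - xj
    return [c / denom for c in poly]


def beta_raw_moment(p, q, r):
    """E[x^r] for the Beta(p, q) distribution, i.e. B(p + r, q) / B(p, q)."""
    m = Fraction(1)
    for k in range(r):
        m *= Fraction(p + k, p + q + k)
    return m


R = 6
nodes = [Fraction(2 * i - R, 2) for i in range(R + 1)]  # centre points -3, ..., 3

# Moments of X = 7x - 3.5 about X = 0, by the binomial formula.
M = [sum(comb(r, j) * 7 ** j * beta_raw_moment(8, 6, j) * Fraction(-7, 2) ** (r - j)
         for j in range(r + 1))
     for r in range(R + 1)]

# Grouped moments (assumed form of the correction for unit group interval).
M_bar = [sum(Fraction(comb(r, 2 * k), (2 * k + 1) * 4 ** k) * M[r - 2 * k]
             for k in range(r // 2 + 1))
         for r in range(R + 1)]

f = [sum(c * m for c, m in zip(basis(nodes, i), M_bar)) for i in range(R + 1)]
P = [float(p) for p in accumulate(f)]  # P at x = 1/7, 2/7, ..., 7/7

if __name__ == "__main__":
    for k, p in enumerate(P, start=1):
        print(f"x = {k}/7   calculated I = {p:.5f}")
```

Because the arithmetic is exact, the results may differ from the printed column by a unit or so in the fifth decimal, which is the level of rounding to be expected from a six-decimal matrix.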


It might be argued that a further error will arise when determining intermediate values of $I_x$ by interpolation in the 'calculated' table. This difficulty could, however, be overcome by shifting the grid of group intervals and using a standard $X$-scale with group end-points corresponding to the odd multiples of 1/14 in $x$, thereby obtaining $I_x$ at points half-way between the arguments of Table 1. Such a method has actually been used in § 7.

5. The remainder term

A formal representation of the remainder term is immediately obtained by reverting to the exact equations (6). If we are concerned with distribution functions having contact of order $m$ at the terminals, the error contributions to the $f_i$ are obtained by substituting the $R + 1$ remainder terms $S_m(r, h)$ ((10), (11)) in the inverse matrix $V_R^{-1}$. It is convenient to use the standard variate $X$-scale, $H = 1$ and the matrix $C_{ir}$, when it will be found that

$$\text{error } f_i = \sum_{r=0}^{R} C_{ir}\,S_m(r, 1), \tag{23}$$

where $S_m(r, 1)$ is given by (10) or (11) putting $h = 1$ and remembering that the integrand function $k$ must be taken in terms of the standard variate $X$, viz.

i=(M,X).X-4‘/(^(^ + {))^4|. (24) 

Since the arguments $\theta$ of $k(\theta, X)$ are unknown it will as a rule be necessary to substitute their respective maxima in (23), at the same time taking $|C_{ir}|$ in place of $C_{ir}$.

Although with (23) we have given a formal solution of the error term involved, in a manner similar to the remainder terms of interpolation formulae, it will in practice be difficult to estimate the magnitude of the error from this formula. It is hoped, therefore, to go into this aspect more fully in a second paper.

6. Infinite variate range and artificial truncation

When the range of the variate is infinite, i.e. when $a = -\infty$ and/or $b = +\infty$, it is, of course, possible to transform the variate $x$ by, say, $y = y(x)$ such that the range of $y$ is finite. However, in general, we shall not be able to assume that the moments of $y$ are known or that they can be derived from those of $x$. It is therefore necessary to adapt our method to deal with an infinite variate range. We shall treat here the case $b = +\infty$, the case $a = -\infty$ being identical and the case $a = -\infty$ and $b = +\infty$ being analogous.
For an infinite variate range, the condition of high contact is now replaced by

$$\lim_{x \to \infty} f^{(i)}(x) = 0 \qquad (i = 0, 1, \ldots, m), \tag{25}$$

which results in remainder terms analogous to (10) and (11).* Similarly, in equations (6), which correspond to Kendall's (1945) equations (3.40), the summation now extends from $i = 1$ to $i = \infty$, there being an infinity of frequencies $f_i = \int_{x_{i-1}}^{x_i} f(x)\,dx$. Now since the $\mu_r$ exist we know that

$$\sum_{i=1}^{\infty} f_i\,\xi_i^{\,r} \tag{26}$$

is convergent. Accordingly

$$\lim_{b \to \infty} \sum_{i=R+2}^{\infty} f_i\,\xi_i^{\,r} = 0, \tag{27}$$

if $h = (b - a)/(R + 1)$. If, therefore, we denote the above sums by $\epsilon(r, b)$ respectively we have, from (6),

$$\sum_{i=1}^{R+1} f_i\,\xi_i^{\,r} + \epsilon(r, b) = \mu_r + G(r, h) + S(r, h). \tag{28}$$

Applying now the previous method we introduce an additional error in the calculation of $f_i$, but this error is smaller than $\max_r |\epsilon(r, b)| \sum_r |C_{ir}|$.


The precise determination of the $\epsilon(r, b)$ for any given $b$ would, of course, require a knowledge of the nature of the convergence in (26), i.e. some external knowledge about the distribution $f(x)$ which we are seeking to determine numerically. Unfortunately, such knowledge will in general not be available.

However, if $b$ is chosen sufficiently large, the $f_i$ determined for different values of $b$ should all yield, by the method of §§ 2 and 3, approximations to the same probability integral $P(x)$ to within the errors of the respective remainder terms $S(r, h)$ and to within the errors introduced by (27). In practice, therefore, one would make an intelligent guess at the likely range of $b$ and then test for internal consistency by comparing the probability integral tables obtained by varying $b$ over this range. This method, which is illustrated in § 7, gives an idea of the accuracy to which the integral has been determined, but no rigorous statement on accuracy can be made without appealing to some a priori knowledge about $f(x)$. It is hoped to deal with this aspect more fully in the next paper.


7. The calculation of the χ-distribution for 10 degrees of freedom

As an illustration of the preceding section, we will now calculate the χ-distribution for 10 degrees of freedom. This distribution has high contact at either terminal and, although it is known to start at $x = \chi = 0$, we shall treat it as a distribution of double infinite range, i.e. we shall not make direct use of the information that $f(x) = 0$ for $x < 0$, and choose a truncated range $a \le x \le b$.

We have a mean of $\mu_1 = 3.0843\,2776$, and the moments about the mean are given by†
$\mu'_2 = 0.486\,9223$, $\mu'_3 = 0.080\,6720$, $\mu'_4 = 0.713\,2999$, $\mu'_5 = 0.386\,6784$, $\mu'_6 = 1.810\,4865$.

* A formula for $S(r, h)$ when the range is infinite will be given in the second paper.

† These follow from the formulae for the moments about the origin which are ratios of Γ-functions (see, for example, Kendall, 1945, p. 65). Note that we have used $\mu_r$ and $\mu'_r$ for moments about the origin and the mean, respectively.
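The moments quoted for the χ-distribution can be checked numerically. The sketch below is an illustration with hypothetical function names; it uses the standard result that the $r$th moment of χ about the origin for $n$ degrees of freedom is $2^{r/2}\,\Gamma\{\tfrac12(n+r)\}/\Gamma(\tfrac12 n)$ and then passes to moments about the mean by the binomial formula.

```python
from math import comb, gamma


def chi_raw_moment(n, r):
    """r-th moment about the origin of chi with n degrees of freedom."""
    return 2 ** (r / 2) * gamma((n + r) / 2) / gamma(n / 2)


def chi_central_moments(n, order):
    """Return the mean and the moments about the mean up to the given order."""
    raw = [chi_raw_moment(n, r) for r in range(order + 1)]
    mean = raw[1]
    central = [sum(comb(r, j) * raw[j] * (-mean) ** (r - j) for j in range(r + 1))
               for r in range(order + 1)]
    return mean, central


if __name__ == "__main__":
    mean, central = chi_central_moments(10, 6)
    print(f"mean = {mean:.8f}")
    for r in range(2, 7):
        print(f"moment {r} about the mean = {central[r]:.7f}")
```

Double-precision arithmetic is adequate here, since the raw moments involved are no larger than a few thousand.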



The standard deviation is $\sigma \approx 0.7$, and with seven group intervals available to cover the essential range we should choose $h$ of the order of the standard deviation.* Our first attempt is, therefore, (a) $h = 0.8$.

(a) If we make the mean of $x$ the centre point of the innermost interval we have for the truncated range $a = \mu_1 - 3.5 \times 0.8 = \mu_1 - 2.8$ and $b = \mu_1 + 2.8$. For the standard variate $X$, the origin $X = 0$ will coincide with the mean of $x$ and its range will be $-3.5 \le X \le +3.5$. Calculation of the moments ($M_r$) of $X$ and substitution in the matrix (22) yields the following answers for the frequencies $f_i$:

$f_1 = 0.0005$, $f_2 = 0.03325$, $f_3 = 0.26266$, $f_4 = 0.42471$, $f_5 = 0.23196$, $f_6 = 0.04206$, $f_7 = 0.00487$.

The calculated frequency ($f_7$) for the interval $\mu_1 + 2.0 < x \le \mu_1 + 2.8$ is about 0.005, and its contribution to $\mu'_6$ about $0.005\,(2.4)^6 \approx 1$. Since this is an appreciable proportion of $\mu'_6$ it is unlikely that the frequencies beyond $b = \mu_1 + 2.8$ when substituted in (27) can be neglected, i.e. $h$ and $b$ are too small.†

(b) Choosing therefore a larger $h$, we try $h = 1$. If we still keep the mean in the centre of the truncated range we have $a = \mu_1 - 3.5 \approx -0.42$ and $b = \mu_1 + 3.5 = 6.58$ (we know, of course, that $f(x) = 0$ for $x < 0$, so that our $f_1$ will really be the frequency for the interval $0 < x < 0.58$). This time the standard variate is $X = x - \mu_1$, so that $M_r = \mu'_r$, and the above values can be substituted directly in the matrix (22), yielding the comparison of calculated χ-integral and 'exact' χ-integral as shown in Table 2.


Table 2. Comparison of calculated and exact values of the χ-integral

  X = x - μ1    P(x) exact    P(x) calculated    Difference × 10⁻⁵
    -2.5         0.00006         0.00011               -5
    -1.5         0.00929         0.00893               36
    -0.5         0.24466         0.24475               -9
    +0.5         0.76767         0.76785              -18
    +1.5         0.97902         0.97888               14
    +2.5         0.99945         0.99947               -2
    +3.5         1.00000         1.00000                0
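Case (b) can be retraced with the following Python sketch, offered under the same assumptions as before: the names are hypothetical, the matrix (22) is rebuilt in exact fractions rather than read from the printed schedule, and the grouping correction is taken in the form $\sum_k \binom{r}{2k} M_{r-2k}/\{(2k+1)4^k\}$. With $h = 1$ and the origin at the mean, the $M_r$ are simply the moments about the mean.

```python
from fractions import Fraction
from itertools import accumulate
from math import comb, gamma


def basis(nodes, i):
    """Lagrange basis coefficients (ascending powers) for nodes[i]."""
    poly, denom = [Fraction(1)], Fraction(1)
    for j, xj in enumerate(nodes):
        if j != i:
            poly = [a - xj * b for a, b in zip([Fraction(0)] + poly, poly + [Fraction(0)])]
            denom *= nodes[i] - xj
    return [c / denom for c in poly]


def direct_matrix(R):
    """Rows give f_i in terms of M_0..M_R for even R and unit group interval."""
    nodes = [Fraction(2 * i - R, 2) for i in range(R + 1)]

    def weight(r, s):
        # Coefficient of M_s in the grouped moment of order r (assumed form).
        k = r - s
        if k < 0 or k % 2:
            return Fraction(0)
        return Fraction(comb(r, k), (k + 1) * 2 ** k)

    rows = []
    for i in range(R + 1):
        c = basis(nodes, i)
        rows.append([float(sum(c[r] * weight(r, s) for r in range(R + 1)))
                     for s in range(R + 1)])
    return rows


def chi_central_moments(n, order):
    """Mean and moments about the mean of chi with n degrees of freedom."""
    raw = [2 ** (r / 2) * gamma((n + r) / 2) / gamma(n / 2) for r in range(order + 1)]
    mean = raw[1]
    return mean, [sum(comb(r, j) * raw[j] * (-mean) ** (r - j) for j in range(r + 1))
                  for r in range(order + 1)]


mean, M = chi_central_moments(10, 6)
f = [sum(u * m for u, m in zip(row, M)) for row in direct_matrix(6)]
P = list(accumulate(f))  # P(x) at x - mean = -2.5, -1.5, ..., +3.5

if __name__ == "__main__":
    for k, p in enumerate(P):
        print(f"x - mean = {k - 2.5:+.1f}   calculated P = {p:.5f}")
```

Repeating the run with the origin shifted by half an interval (i.e. with moments taken about the mean plus 0.5) gives the second grid used for the internal-consistency check.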


The maximum error is about 0.0004 and, again, the terminal $f_i$ have a higher decimal accuracy. In practice, of course, the exact distribution would not be available for comparison. This time the terminal value $f_7$ is about 0.0005 and represents the frequency for the interval $\mu_1 + 2.5 < x \le \mu_1 + 3.5$. Its contribution to $\mu'_6$ is about 0.4, thereby confirming that the previous grid of group intervals was too fine. To obtain further confirmation on the tail of the distribution, we determine a third set of $f_i$ by shifting the grid of group intervals by 0.5 to the right, retaining the interval $h = 1$. This will make $a = \mu_1 - 3$ and $b = \mu_1 + 4$, i.e. $0.08 \le x \le 7.08$. For our standard variate $X$ the origin will now coincide with $x = \mu_1 + 0.5$.

* An unsuitable choice of $h$ would, later, fail to satisfy the checks of internal consistency.

† Comparison with the exact χ-distribution shows that the maximum error in the above $f_i$ is nevertheless not more than 0.006.













The values of the $M_r$ are as follows:

$M_1 = -0.5$, $M_2 = 0.736\,9223$, $M_3 = -0.774\,7114$, $M_4 = +1.344\,8394$, $M_5 = -1.834\,7942$, $M_6 = 3.595\,7606$.

Substituting these in the matrix (22) we obtain the following values of $f_i$:

$f_1 = 0.00016$, $f_2 = 0.07012$, $f_3 = 0.44430$, $f_4 = 0.40421$, $f_5 = 0.07699$, $f_6 = 0.00419$, $f_7 = 0.00004$.

The comparison of the progressive sums of the above $f_i$ with the exact χ-integral is of similar accuracy to that in Table 2. The terminal frequency $f_7$ for $\mu_1 + 3 \le x \le \mu_1 + 4$ is 0.00004 with a contribution of about 0.03 to $\mu'_6$, indicating that we have now reached a satisfactory choice of $b$.

As a final check on the internal consistency we compare the answers obtained with the two last choices of group intervals by merging the tables of $P(x)$ to obtain one table at interval 0.5. This is set out in Table 3. The differences provide a fair check on the internal consistency to about 3-decimal accuracy of the two separate tables. If a more reliable check is desired, three or even four separate tables may be computed, all at the same group interval $h$, and merged in the above manner to form a single table at interval $\tfrac13 h$ or $\tfrac14 h$. This procedure has the added advantage that interpolation difficulties at the wide interval of $h$ are being avoided.

Table 3. $x = \chi$ for 10 degrees of freedom. Calculated table of $P(x)$ obtained from two separate grids of group intervals ($h = 1$)

  x - μ1     P(x)        Δ        Δ²       Δ³
   -2.5     0.0001        1
   -2.0     0.0002       87       86      442
   -1.5     0.0089      615      528      601
   -1.0     0.0704     1744     1129     -174
   -0.5     0.2448     2699      955    -1122
    0       0.5147     2532     -167     -855
    0.5     0.7679     1510    -1022      112
    1.0     0.9189      600     -910      480
    1.5     0.9789      170     -430      296
    2.0     0.9959       36     -134      103
    2.5     0.9995        5      -31
    3.0     1.0000





8. The special case of symmetrical distributions; the normal integral

By placing the origin of the standard variate $X$ at the mean of a symmetrical distribution we obviously have $f_1 = f_{R+1}$, $f_2 = f_R$, etc., i.e. the number of unknowns is halved. On the other hand, the odd moments contribute the meaningless equations

$$\sum_i f_i\,(\xi_i^{\,r} + \xi_{R+2-i}^{\,r}) = \sum_i f_i \times 0 = 0.$$

With the number of unknowns and equations halved (with even moments only retained) it is necessary to work out a new matrix based on even-order moments only. In practice the important values of $R$ are $R = 4$, 6, 8 and 10, and we are giving below the inverse matrix (for $R = 8$) having rank 5 (as there are five equations corresponding to $\mu_0$, $\mu_2$, $\mu_4$, $\mu_6$ and $\mu_8$):


  i   f_i     M0 = 1        M2           M4           M6           M8
  1   f_1    0.0003441   -0.0017857    0.0012153   -0.0002315    0.0000124
  2   f_2   -0.0039874    0.0208333   -0.0137153    0.0023148   -0.0000868
  3   f_3    0.0224151   -0.1190476    0.0711806   -0.0081019    0.0002480
  4   f_4   -0.0884281    0.5000000   -0.1434028    0.0127315   -0.0003472
  5   f_5    0.5696563   -0.4000000    0.0847222   -0.0067130    0.0001736
                                                                        (29)

Working rule: Each $f_i$ is obtained by forming the sum of five products using the five coefficients in the $i$th line and applying them to $M_0, \ldots, M_8$; e.g. $f_1 = 0.0003441\,M_0 - \ldots + 0.0000124\,M_8$.


As an example we compute the normal integral from its first five even moments, $\mu_0$ to $\mu_8$, choosing $h = 1$ and the standard variate $X$ as normal deviate. Substituting, therefore, in the matrix (29) $M_0 = 1$, $M_2 = 1$, $M_4 = 3$, $M_6 = 15$ and $M_8 = 105$, we obtain the five $f_i$ which in Table 4 have been progressively added to form the 'calculated normal integral' to be compared with the 'exact' one. The accuracy is remarkable, the maximum error being 15 in the 6th decimal.


Table 4. Comparison of calculated normal integral with exact normal integral

  x = X     Exact P(x)    Calculated P(x)    Difference × 10⁻⁶
   -4        0.000032        0.000034              -2
   -3        0.001350        0.001342               8
   -2        0.022750        0.022765             -15
   -1        0.158655        0.158643              12
    0        0.500000        0.500000               0
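The symmetrical case can be sketched in the same way. The code below is an illustration only, with hypothetical names. It assumes ten unit intervals with centre points at $\pm\tfrac12, \ldots, \pm 4\tfrac12$ (so that $P(x)$ is obtained at $x = -4, \ldots, 0$ as in Table 4), works with the squared centre points, and applies the grouping correction in the assumed form $\sum_k \binom{2j}{2k} M_{2j-2k}/\{(2k+1)4^k\}$. Under that reading it agrees with the coefficients of (29) and with the calculated column of Table 4.

```python
from fractions import Fraction
from itertools import accumulate
from math import comb


def basis(nodes, i):
    """Lagrange basis coefficients (ascending powers) for nodes[i]."""
    poly, denom = [Fraction(1)], Fraction(1)
    for j, tj in enumerate(nodes):
        if j != i:
            poly = [a - tj * b for a, b in zip([Fraction(0)] + poly, poly + [Fraction(0)])]
            denom *= nodes[i] - tj
    return [c / denom for c in poly]


def symmetric_matrix(n_half):
    """Rows give f_1..f_n_half in terms of the even moments M_0, M_2, ..."""
    # Squared centre points, outermost interval first.
    t = [Fraction(2 * k - 1, 2) ** 2 for k in range(n_half, 0, -1)]
    rows = []
    for i in range(n_half):
        c = basis(t, i)  # acts on the grouped even moments; gives 2 * f_i
        row = []
        for s in range(n_half):
            total = sum(c[j] * Fraction(comb(2 * j, 2 * (j - s)),
                                        (2 * (j - s) + 1) * 4 ** (j - s))
                        for j in range(s, n_half))
            row.append(total / 2)
        rows.append(row)
    return rows


rows = symmetric_matrix(5)
normal_even_moments = [1, 1, 3, 15, 105]  # M_0, M_2, M_4, M_6, M_8
f = [sum(u * m for u, m in zip(row, normal_even_moments)) for row in rows]
P = [float(p) for p in accumulate(f)]  # P(x) at x = -4, -3, -2, -1, 0

if __name__ == "__main__":
    for x, p in zip(range(-4, 1), P):
        print(f"x = {x:+d}   calculated P = {p:.6f}")
```

Working in the squared variable halves the size of the system, which is the economy the text describes.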


With symmetrical distributions we cannot, of course, shift the grid of group intervals, as otherwise we would lose the symmetry relation between the $f_i$. If, therefore, intermediate values of $P(x)$ are required in order to ease subsequent interpolation, we can achieve this only by altering $h$. Merging the answers obtained from (say) three different $h$ grids all centred at $x = 0$ (e.g. $h = 0.9$, 1.0 and 1.1), we would not obtain a table of $P(x)$ at an equidistant interval. In the internal check we would, therefore, use divided differences.


9. Divergent or poorly convergent moments; the t-distribution for 10 degrees of freedom

Some variates with infinite range have distribution functions with low contact at $x = \infty$, i.e. the convergence in

$$\lim_{x \to \infty} f(x) = 0 \tag{30}$$

is slow; indeed, in some cases the moment $\mu_r$ is divergent for, say, $r \ge R'$.

As an example we have investigated the t-distribution for 10 degrees of freedom. Here we have $f(t) = c\,(1 + t^2/10)^{-11/2}$ and hence $R' = 10$. In this case, therefore, $R'$ is known a priori. If no such mathematical information is available, warning of low contact is given by the rapid growth of the moments $\mu_r$ as $r$ increases, provided $R$ is near to $R'$.* For our example for the t-distribution we find

$\mu_4 = 6.25$, $\mu_6 = 78.125$, $\mu_8 = 2734.375$.

The difficulty with such distributions is that artificial truncation is not justified if the high-order, poorly convergent moments are to be used in equations (6). The remedy in such cases is the square variate transformation $y^2 = x$. Sometimes it may be necessary to use a higher power $y^k = x$. Obviously, if we were to take an equidistant interval for $y$, the group interval for $x$ will grow with the square law, thereby absorbing the slowly convergent tail end of $f(x)$.

Now, obviously, the moments of $y$ are simply related to those of $x$; we have

$$\int_0^\infty x^r f(x)\,dx = 2\int_0^\infty y^{2r}\,y\,f(y^2)\,dy, \tag{31}$$

or introducing the new distribution function $g(y) = 2y f(y^2)$, we have

$$\int_0^\infty x^r f(x)\,dx = \int_0^\infty y^{2r} g(y)\,dy. \tag{32}$$

Applying now the previous method to $g(y)$ it is further necessary to avoid using the poorly convergent high-order moments. In the case of the t-distribution, instead of taking $r = 0$, 2, 4, 6 and 8, we take the absolute moments† for $r = 0$, 1, 2, 3 and 4, which, according to (32), correspond to the even moments of $g(y)$. If only even moments about the origin are used in the determination of the $f_i$, the matrix (29) gives the appropriate inversion. Using $h = 0.6$ for the $y$-group interval we substitute in (29):

$M_0 = 1$, $M_2 = 2.401906$, $M_4 = 9.645062$, $M_6 = 52.952032$ and $M_8 = 372.108863$.

We thereby obtain five values of $f_i$ $(i = 1, \ldots, 5)$ of the form

$$f_i = \int_{(5-i)h}^{(6-i)h} g(y)\,dy = \int_{(5-i)^2 h^2}^{(6-i)^2 h^2} f(x)\,dx. \tag{33}$$

The progressive sums of these are compared with the corresponding values of the exact t-integral in Table 5. Although the accuracy is lower than in the previous example it is satisfactory and very much better than we could have obtained without applying the transformation $y^2 = x$.
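The moments substituted in (29) can be checked with a few lines of Python. The sketch is illustrative and the function name is hypothetical; it relies on the standard formula for the absolute moments of Student's t with $\nu$ degrees of freedom, $E|t|^r = \nu^{r/2}\,\Gamma\{\tfrac12(r+1)\}\,\Gamma\{\tfrac12(\nu - r)\}/\{\sqrt{\pi}\,\Gamma(\tfrac12\nu)\}$ for $r < \nu$, and on the scaling $M_{2j} = E|t|^j / h^{2j}$ that follows from $y^2 = |t|$ with group interval $h$ in $y$.

```python
from math import gamma, pi, sqrt


def abs_moment_t(nu, r):
    """E|t|^r for Student's t with nu degrees of freedom (finite only for r < nu)."""
    if r >= nu:
        raise ValueError("moment diverges for r >= nu")
    return nu ** (r / 2) * gamma((r + 1) / 2) * gamma((nu - r) / 2) / (sqrt(pi) * gamma(nu / 2))


h = 0.6  # group interval on the y-scale, where y**2 = |t|

# Even moments of the standardized y-variate: M_0, M_2, M_4, M_6, M_8.
M = [abs_moment_t(10, j) / h ** (2 * j) for j in range(5)]

if __name__ == "__main__":
    for j, m in enumerate(M):
        print(f"M_{2 * j} = {m:.6f}")
    # The ordinary even moments of t grow rapidly, which is the warning sign of low contact.
    for r in (4, 6, 8):
        print(f"mu_{r} = {abs_moment_t(10, r):.3f}")
```

Only absolute moments up to order 4 enter the calculation, so the poorly convergent moments of order 6 and 8 are avoided altogether.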


Table 5. Comparison of calculated and exact values of the t-integral

     t       P(t) exact    P(t) calculated    Difference × 10⁻⁴
   -5.76       0.0001          0.0001                0
   -3.24       0.0044          0.0042                2
   -1.44       0.0902          0.0905               -3
   -0.36       0.3613          0.3648              -35
    0.00       0.5000          0.5000                0

* If $R$ is much smaller than $R'$, the present difficulty will not arise at all.

† We shall show in a second paper that, if the absolute moments of a distribution are not known, they can be obtained by interpolation between the values of $\log \mu_r$ for $r = 2$, 4, 6, 8, etc.; in fact, we shall give a general discussion of the interpolability of the logarithmic moment function for positive $x$.






10. Lack of high contact at the start of the variate x = a

We confine ourselves here to the most important case of lack of high contact at one terminal, say the start of the distribution, and assume, therefore, that there is high contact at the other end of the range.

Without loss of generality we assume that $a = 0$, i.e. $x \ge 0$, and introduce the new variate $y = x^{1/k}$, $k \ge 2$. Whence we have

$$\int_0^b x^r f(x)\,dx = \int_0^{b^{1/k}} y^{kr} g(y)\,dy, \tag{34}$$

where $g(y) = k\,y^{k-1} f(y^k)$. Obviously $g(y)$ has, at least, contact of order $k - 1$ at the start $y = 0$; further, if $f(x)$ has contact of order $m$ at $x = b$, i.e. if $f(x) = O(x^{-m})$ at $x = b$, then $g(y) = O(y^{-(m-1)k-1})$. Hence there is high-order contact, of order $k - 1$ and $(m - 1)k + 1$, respectively, at both ends of the range. The previous method is therefore applicable to $g(y)$ provided we can obtain its moments from those of $f(x)$. It is obvious from (34) that in order to obtain the ordinary moments of $g(y)$ we require to know the 'fractional' moments of $f(x)$, i.e. those corresponding to $r = j/k$ $(j = 0, 1, \ldots)$. If the moments of $f(x)$ are only known for integer $r$ the fractional moments will have to be obtained by interpolation of the logarithmic moment function $\log \mu_r$, which will be more fully discussed in the next paper.


REFERENCES

Bartlett, M. S. (1937). Proc. Roy. Soc. A, 160, 268.

Fisher, R. A. (1929). Proc. Lond. Math. Soc. (2), 30, 199.

Fisher, R. A. (1930). Proc. Roy. Soc. A, 130, 16.

Geary, R. C. (1947). Biometrika, 34, 68.

Geary, R. C. & Worlledge, J. P. G. (1947). Biometrika, 34, 98.

Hartley, H. O. (1940). Biometrika, 31, 249.

Kendall, M. G. (1938). J.R. Statist. Soc. 101, 592.

Kendall, M. G. (1945). The Advanced Theory of Statistics, 1. London: C. Griffin and Co.

Milne-Thomson, L. M. (1933). The Calculus of Finite Differences. London: Macmillan.

Nair, U. S. (1936). Biometrika, 30, 274.

Nayer, P. P. N. (1936). Statist. Res. Mem. 1, 38.

Neyman, J. & Pearson, E. S. (1931). Bull. int. Acad. Cracovie, A, p. 460.

Sukhatme, P. V. (1936). Statist. Res. Mem. 1, 94.

Tables of the Incomplete Beta-Function, edited by Karl Pearson (1934). London: Biometrika Office.

Welch, B. L. (1935). Biometrika, 27, 145.

Welch, B. L. (1936). Statist. Res. Mem. 1, 1.





APPROXIMATION TO PERCENTAGE POINTS OF 
THE z-DISTRIBUTION 


By A. H. CARTER, King's College, Cambridge


Tables have been published of the values of $z$ for various percentage levels (20, 5, 1 and 0.1 %) for a range of given $n_1$, $n_2$ (Fisher & Yates, 1943, Table V). When $n_1$ or $n_2$ is outside the range of the tables, recourse must be had to approximate formulae (unless, of course, interpolation is sufficiently accurate) which will combine accuracy with facility of computation. One such formula, due to Fisher, with a modification suggested by Cochran (1940), is given at the foot of the above-mentioned tables. The purpose of this paper is to derive an alternative formula, no more difficult to compute, which will be shown to give consistently closer approximations to the true value of $z$ for all except small $n_1$ or $n_2$.

Wishart (1947) has derived formulae for the exact cumulants of $z$, and also the well-known approximations to them when $n_1$, $n_2$ are large. The exact cumulants as far as $\kappa_4$ can be readily obtained arithmetically from tables of the Polygamma functions. Knowing the cumulants of the distribution, we may make use of the Cornish-Fisher normalization function method, based on Edgeworth's form of the Gram-Charlier type A series (Cornish & Fisher, 1937), to approximate to the percentage points. The method consists in writing $z$ as an expansion in powers of $\xi$, a corresponding normal variate, the coefficients being functions of the cumulants of $z$, and assumes that $\kappa_r$ is of order $n^{1-r}$, which is true for the z-distribution (Wishart, 1947, p. 172).

If $z$ and $\xi$ are expressed in standard measure (i.e. mean zero, standard deviation unity) we then derive

$$z = \xi + \frac{\kappa_3}{6}(\xi^2 - 1) + \frac{\kappa_4}{24}(\xi^3 - 3\xi) - \frac{\kappa_3^2}{36}(2\xi^3 - 5\xi),$$

correct to order $n^{-1}$ for $z$, the cumulants here being those of the standardized variate. This gives

Formula (a):
$$z = \mu_1 + \sigma\xi + \frac{\kappa_3}{\sigma^2}\,\frac{\xi^2 - 1}{6} + \frac{\kappa_4}{\sigma^3}\,\frac{\xi^3 - 3\xi}{24} - \frac{\kappa_3^2}{\sigma^5}\,\frac{2\xi^3 - 5\xi}{36}, \tag{1}$$

correct to order $n^{-3/2}$ (since $\sigma = O(n^{-1/2})$), where $\mu_1\,(= \kappa_1)$, $\sigma^2\,(= \kappa_2)$, $\kappa_3$, $\kappa_4$ are cumulants of the z-distribution. The ξ-coefficients may be readily computed; e.g. for the 5 % level, substitute $\xi = 1.64485$. Table 2 gives the values, for the 20, 5, 1 and 0.1 % levels, of the coefficients required in applying the formula. The quantities $\mu_1$, $\sigma$, $\kappa_3$, $\kappa_4$ depend of course on $n_1$ and $n_2$, and may be evaluated in any particular case, whence substitution in (1) gives the appropriate value of $z$. Since $|z_{1-P}(n_1, n_2)| = |z_P(n_2, n_1)|$, where $z_P$ is the value of $z$ corresponding to probability $P$, to find the percentage points for the 'negative tail', i.e. 80, 95, 99 and 99.9 %, we may simply interchange $n_1$ and $n_2$. This has the effect of changing the sign of the odd cumulants, so that in (1) we write $-\mu_1$ and $-\kappa_3$ for $\mu_1$ and $\kappa_3$.

Formula (a), being an approximation to order $n^{-3/2}$, may be expected to give reliable results when $n_1$ and $n_2$ are both large. For the 1 % point, for example, we find $z(6, 12) = 0.7843$ (true value 0.7864), whereas $z(24, 60) = 0.3744$ (true value 0.3746). Some further results for (6, 12), (6, 60), and (24, 60) are shown in Table 1 (a).

In practice, some labour is involved in applying formula (a), even if polygamma tables are available. The Fisher-Cochran formula, derived by the normalization function method, is a simple working approximation, valid for large $n_1$, $n_2$, in which the exact cumulants are replaced by their approximations in terms of inverse powers of $n_1$ and $n_2$.



Table 1. Comparison of approximations to the percentage points of z

Formula (b): Existing formula (Fisher-Cochran).

Formula (c): New formula.

[Table 1: for each of the 20, 5, 1 and 0.1 % levels, values of $z$ from formula (b), from formula (c) and the true $z$, for $(n_1, n_2)$ = (6, 12), (6, 60), (24, 60), (20, 36), (20, 100), (36, 60) and for the interchanged pairs including (36, 20), (100, 20), (60, 36). The numerical entries are not preserved in this copy.]

Table 1 (a). Some values of z from formula (a) (exact cumulant formula)

(For corresponding true values, see Table 1)

[Table 1 (a): the numerical entries are not preserved in this copy.]

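The cumulant function of $z$ is

$$K(t) = \tfrac12 it \log\frac{n_2}{n_1} + \log\Gamma\!\left(\frac{n_1 + it}{2}\right) + \log\Gamma\!\left(\frac{n_2 - it}{2}\right) - \log\Gamma\!\left(\frac{n_1}{2}\right) - \log\Gamma\!\left(\frac{n_2}{2}\right),$$

and the cumulants are obtainable by differentiating this successively with respect to $(it)$, at each stage putting $t = 0$.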


Since 




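and $\log\Gamma(\tfrac12 n)$ can be expanded by Stirling's theorem in inverse powers of $n$, the cumulants may also readily be expressed in inverse powers of $n_1$, $n_2$; and, when $n_1$, $n_2$ are reasonably large, the first few terms only in the expansions will give sufficiently close approximations to them. In fact, writing

$$s = \frac{1}{n_1} + \frac{1}{n_2}, \qquad d = \frac{1}{n_1} - \frac{1}{n_2},$$

it has been shown that

$$\begin{aligned}
\kappa_1 = \mu_1 &= -\tfrac12 d - \tfrac16 sd + O(n^{-4}),\\
\kappa_2 = \sigma^2 &= \tfrac12 s + \tfrac14(s^2 + d^2) + O(n^{-3}),\\
\kappa_3 &= -\tfrac12 sd + O(n^{-3}),\\
\kappa_4 &= \tfrac14 s(s^2 + 3d^2) + O(n^{-4}),\\
\kappa_r &= O(n^{1-r}) \qquad (r > 1).
\end{aligned} \tag{A}$$

Formula (a) will now have an extra term, since we take as our 'working variance' of $z$ not its exact value, but its approximation to order $n^{-1}$ from (A), i.e. $\tfrac12 s$. In the notation of Kendall (1945), $l_2 = \kappa_2/\tfrac12 s - 1 \approx \tfrac12(s + d^2/s)$.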

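We then obtain

$$z = \mu_1 + \frac{\xi}{\sqrt{(2/s)}}\left\{1 + \frac{\xi^2 + 3}{12\,(2/s)}\right\} - \frac{d}{6}(\xi^2 - 1) + \frac{d^2}{144}\sqrt{\frac{2}{s}}\,(\xi^3 + 11\xi), \tag{2}$$

or

$$z = \mu_1 + \frac{\xi}{\sqrt{(h - \lambda)}} - \frac{d}{6}(\xi^2 - 1), \tag{3}$$

where $h = 2/s$, $\lambda = \tfrac16(\xi^2 + 3)$, provided $\tfrac{1}{144} d^2 \sqrt{(2/s)}\,(\xi^3 + 11\xi)$ may be neglected (which will be the case for small $d$). Inserting the approximation to $\mu_1$ from (A), i.e. $-\tfrac12 d$,

$$z = \frac{\xi}{\sqrt{(h - \lambda)}} - \frac{d}{6}(\xi^2 + 2),$$

the Fisher-Cochran formula, which has, in fact, been found to give a fairly close approximation to the true $z$ for $n_1$, $n_2$ both reasonably large. It may be noted that if $n_1$, $n_2$ are not very large, an improvement will be effected by including the second term in the estimate of the mean ($\kappa_1$), i.e. from (A) by adding $-\tfrac16 sd$.

For $(n_1, n_2) = (6, 12)$ this correction is $-0.00347$, and for (24, 60), the correction is $-0.00024$. Inserting this improved approximation to $\mu_1$ in (3) we have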


Formula (b):

$$z = \frac{\xi}{\sqrt{(h - \lambda)}} - \frac{d}{6}(\xi^2 + 2 + s). \tag{3b}$$

As pointed out by Wishart (1947, p. 179), an approximation to the value of any $\kappa_r$ $(r > 1)$ obtained by considering its leading term only, will be improved by writing $1/(n_1 - 1)$ and $1/(n_2 - 1)$ for $1/n_1$ and $1/n_2$. For, by Stirling's expansion of a factorial,

=2^'+2^ + 3;3+0(n-‘) 

and so on. 

Table 2. ξ-coefficients required in applying formula (a)

                      20 %        5 %        1 %       0.1 %
  ξ                 0.84162    1.64485    2.32635    3.09023
  (ξ² - 1)/6       -0.04861    0.28426    0.73532    1.42492
  (ξ³ - 3ξ)/24     -0.08036   -0.02018    0.23379    0.84332
  (2ξ³ - 5ξ)/36    -0.08377    0.01878    0.37634    1.21026
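The entries of Table 2 are simple functions of the normal deviate and can be regenerated as follows. This is a small illustrative sketch; the function name is hypothetical, and the normal percentage points are taken from Python's standard library rather than from printed tables.

```python
from statistics import NormalDist


def cornish_fisher_coefficients(upper_tail):
    """Return xi and the three xi-coefficients of formula (a) for a given upper-tail level."""
    xi = NormalDist().inv_cdf(1.0 - upper_tail)
    return (
        xi,
        (xi ** 2 - 1) / 6,           # multiplies kappa_3 / sigma^2
        (xi ** 3 - 3 * xi) / 24,     # multiplies kappa_4 / sigma^3
        (2 * xi ** 3 - 5 * xi) / 36, # multiplies kappa_3^2 / sigma^5 (entering with a minus sign)
    )


if __name__ == "__main__":
    for level in (0.20, 0.05, 0.01, 0.001):
        xi, c3, c4, c33 = cornish_fisher_coefficients(level)
        print(f"{100 * level:5.1f} %  {xi:9.5f} {c3:9.5f} {c4:9.5f} {c33:9.5f}")
```

For the negative tail the same coefficients apply after interchanging $n_1$ and $n_2$, as described in the text.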


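Thus writing

$$s' = \frac{1}{n_1 - 1} + \frac{1}{n_2 - 1}, \qquad d' = \frac{1}{n_1 - 1} - \frac{1}{n_2 - 1}, \tag{B}$$

we might expect a better approximation to $z$ to be obtained, corresponding to that of Fisher and Cochran, if we use $s'$ and $d'$ instead of $s$ and $d$.

Corresponding to equations (A) we have

$$\begin{aligned}
\kappa_2 &= \tfrac12 s' - \tfrac{1}{24} s'(s'^2 + 3d'^2) + O(n^{-4}),\\
\kappa_3 &= -\tfrac12 s'd' + O(n^{-4}),\\
\kappa_4 &= \tfrac14 s'(s'^2 + 3d'^2) + O(n^{-5}),\\
\kappa_r &= O(n^{1-r}) \qquad (r > 1).
\end{aligned}$$

For the mean, however,

$$\mu_1 = \kappa_1 = -\tfrac12 d - \tfrac16 sd + O(n^{-4}), \tag{4}$$

$$\phantom{\mu_1 = \kappa_1} = -\tfrac12 d' + \tfrac13 s'd' + O(n^{-3}). \tag{4a}$$

If $n^{-3}$ is not negligible (relative to the degree of accuracy desired) $\mu_1$ should therefore be left in the form $-\tfrac12 d - \tfrac16 sd$.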

Proceeding as before, we obtain

$$z = \mu_1 + \frac{\xi}{\sqrt{(2/s')}}\left\{1 + \frac{\xi^2 - 3}{12\,(2/s')}\right\} - \frac{d'}{6}(\xi^2 - 1) + \frac{d'^2}{144}\sqrt{\frac{2}{s'}}\,(\xi^3 - 7\xi), \tag{5}$$

whence

$$z = \mu_1 + \frac{\xi}{\sqrt{(h' - \lambda')}} - \frac{d'}{6}(\xi^2 - 1), \tag{6}$$

where $h' = 2/s'$, $\lambda' = \tfrac16(\xi^2 - 3)$.

Since this is based in the first place on more accurate approximations to the cumulants $\kappa_3$ and $\kappa_4$, and since the term omitted from (5) in deriving (6) (i.e. $\tfrac{1}{144} d'^2 \sqrt{(2/s')}\,(\xi^3 - 7\xi)$) is evidently numerically less than the corresponding term omitted in obtaining the Fisher-Cochran formula (i.e. $\tfrac{1}{144} d^2 \sqrt{(2/s)}\,(\xi^3 + 11\xi)$), formula (6) might be expected to give an improved approximation to $z$. In fact, however, it does not, and the reason is not far to seek.

Consider the expansion of $\xi/\sqrt{(h - \lambda)}$ in both cases:

$$\frac{\xi}{\sqrt{(h - \lambda)}} = \frac{\xi}{\sqrt{h}}\left\{1 + \frac{\lambda}{2h} + \frac{3\lambda^2}{8h^2} + \ldots\right\},$$

where the terms are decreasing in magnitude (since $1/h = \tfrac12 s = O(n^{-1})$). Hence the error in neglecting all terms after the second will be approximately of the order of the third term.

Now in the Fisher-Cochran approximation this term, $3\lambda^2\xi/(8h^{5/2})$, has the same sign as the omitted term $\tfrac{1}{144} d^2 \sqrt{(2/s)}\,(\xi^3 + 11\xi)$ (both being of the same sign as $\xi$), so that the extra terms included will tend to compensate for the term omitted. In obtaining (6) from (5), on the other hand, the term omitted, $\tfrac{1}{144} d'^2 \sqrt{(2/s')}\,(\xi^3 - 7\xi)$, will be of opposite sign to $\xi$ when $|\xi| < \sqrt{7}$, corresponding to a probability of about 0.004; so that for most percentage levels encountered in practice, the error in (5) is increased in (6).

A better formula is obtained from (5) as

$$z = \mu_1 + \frac{\xi\sqrt{(h' + \lambda')}}{h'} - \frac{d'}{6}(\xi^2 - 1), \tag{7}$$

where $h'$ and $\lambda'$ are as in (6).

Expanding $\sqrt{(h' + \lambda')}/h'$, it is found that the third term is now of opposite sign to $\xi$ and hence the extra terms contained in the expansion will tend to compensate for the term omitted. Since $s'$ and $d'$ require to be calculated in applying this formula, it is desirable to write $\mu_1$ in the form (4a) (provided we can neglect quantities of order $n^{-3}$). This gives

Formula (c):
$$z = \frac{\xi\sqrt{(h' + \lambda')}}{h'} - \frac{d'}{6}(\xi^2 + 2 - 2s'). \tag{7a}$$


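Collecting the results, we have the three approximate formulae:

Formula (a) (exact cumulant method):

$$z = \mu_1 + \sigma\xi + \frac{\kappa_3}{\sigma^2}\,\frac{\xi^2 - 1}{6} + \frac{\kappa_4}{\sigma^3}\,\frac{\xi^3 - 3\xi}{24} - \frac{\kappa_3^2}{\sigma^5}\,\frac{2\xi^3 - 5\xi}{36}.$$

Formula (b) (Fisher-Cochran formula):

$$z = \frac{\xi}{\sqrt{(h - \lambda)}} - \frac{d}{6}(\xi^2 + 2 + s),$$

where

$$s = \frac{1}{n_1} + \frac{1}{n_2}, \qquad d = \frac{1}{n_1} - \frac{1}{n_2}, \qquad h = \frac{2}{s}, \qquad \lambda = \frac{\xi^2 + 3}{6}.$$

Formula (c) (new formula):

$$z = \frac{\xi\sqrt{(h' + \lambda')}}{h'} - \frac{d'}{6}(\xi^2 + 2 - 2s'),$$

where

$$s' = \frac{1}{n_1 - 1} + \frac{1}{n_2 - 1}, \qquad d' = \frac{1}{n_1 - 1} - \frac{1}{n_2 - 1}, \qquad h' = \frac{2}{s'}, \qquad \lambda' = \frac{\xi^2 - 3}{6}.$$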
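Formulae (b) and (c) are easily programmed. The Python sketch below is an illustration only: the function names are hypothetical, the upper-tail level is passed as a probability, and the normal deviate is taken from the standard library. For the 1 % point with $(n_1, n_2) = (24, 60)$ both functions return values close to the tabulated 0.3746 quoted earlier.

```python
from math import sqrt
from statistics import NormalDist


def z_formula_b(n1, n2, upper_tail):
    """Fisher-Cochran approximation, formula (b), to the upper percentage point of z."""
    xi = NormalDist().inv_cdf(1.0 - upper_tail)
    s = 1 / n1 + 1 / n2
    d = 1 / n1 - 1 / n2
    h = 2 / s
    lam = (xi ** 2 + 3) / 6
    return xi / sqrt(h - lam) - d / 6 * (xi ** 2 + 2 + s)


def z_formula_c(n1, n2, upper_tail):
    """New approximation, formula (c), using n - 1 in place of n."""
    xi = NormalDist().inv_cdf(1.0 - upper_tail)
    s = 1 / (n1 - 1) + 1 / (n2 - 1)
    d = 1 / (n1 - 1) - 1 / (n2 - 1)
    h = 2 / s
    lam = (xi ** 2 - 3) / 6
    return xi * sqrt(h + lam) / h - d / 6 * (xi ** 2 + 2 - 2 * s)


if __name__ == "__main__":
    for n1, n2 in ((24, 60), (20, 100), (36, 60)):
        for level in (0.20, 0.05, 0.01, 0.001):
            print(n1, n2, f"{100 * level:5.1f} %",
                  f"(b) {z_formula_b(n1, n2, level):.4f}",
                  f"(c) {z_formula_c(n1, n2, level):.4f}")
```

Percentage points for the negative tail follow by interchanging `n1` and `n2` and changing the sign of the result.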



A. H. Carteb. 357 

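When $n_1$, $n_2$ are large (and not too different), $\tfrac16 sd$ is negligible and formula (b) becomes the formula more generally quoted

$$z = \frac{\xi}{\sqrt{(h-\lambda)}} - \frac{\xi^2+2}{6}\left(\frac{1}{n_1} - \frac{1}{n_2}\right),$$

which may be written

$$z = \frac{\xi}{\sqrt{(h-\lambda)}} - \left(\frac{1}{n_1} - \frac{1}{n_2}\right)\left(\lambda - \tfrac16\right).$$

Similarly, for sufficiently large $n_1$, $n_2$, $\tfrac13 s'd'$ may be neglected and formula (c) becomes

$$z = \frac{\xi\sqrt{(h'+\lambda')}}{h'} - \frac{\xi^2+2}{6}\left(\frac{1}{n_1-1} - \frac{1}{n_2-1}\right) = \frac{\xi\sqrt{(h'+\lambda')}}{h'} - \left(\frac{1}{n_1-1} - \frac{1}{n_2-1}\right)\left(\lambda' + \tfrac56\right).$$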

It is to be noted, however, that since $\tfrac13 s'd'$ is approximately twice $\tfrac16 sd$, more care must be exercised in deciding to neglect it. For example, when $(n_1, n_2)$ is (20, 100), $\tfrac13 s'd' = 0.0009$, and for (24, 60), its value is 0.0005.

For purposes of comparison, values of z have been computed from formulae (b) and (c), for the four common percentage levels, over a fairly wide range of $n_1$, $n_2$. They are shown in Table 1, together with the corresponding true values of z. The latter were obtained where possible from the tables of Fisher and Yates; elsewhere by inverse interpolation in Tables of the Incomplete Beta-Function followed by a logarithmic transformation. Such values are in error by not more than 0.0001. It will be seen that neither formula yields very accurate results when $n_1$ or $n_2$ is as small as 6, though even here the new formula is rather better with the single exception of $n_1 = 12$, $n_2 = 6$. In actual practice, however, we are concerned with large values of $n_1$, $n_2$ beyond the range of the published tables. Considering only those cases where $n_1$ and $n_2$ are both greater than 20, it is seen that formula (c) gives a consistently closer approximation than does formula (b) for both the positive and the negative tails, and for all the percentage levels investigated, though its relative gain in accuracy is greatest at the 1 and 0.1 % levels. It may be noted, in fact, that in no case considered having $n_1$ and $n_2$ greater than 20 is the error more than 9 in the fourth decimal place, i.e. it appears that for all except small $n_1$, $n_2$ this formula will give an approximation to z correct to within 0.001.

In conclusion, therefore, it is recommended that formula (c) be adopted for general use, 
since it is no more difficult to compute, and is more accurate, than the existing formula. 
Dropping the dashes we have the formula 


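$$z = \frac{\xi\sqrt{(h+\lambda)}}{h} - \left(\frac{1}{n_1-1} - \frac{1}{n_2-1}\right)\left(\lambda + \frac56 - \frac{s}{3}\right),$$

where

$$\frac{2}{h} = \frac{1}{n_1-1} + \frac{1}{n_2-1} = s, \qquad \lambda = \frac{\xi^2-3}{6},$$

or, if

$$\frac13\left\{\frac{1}{(n_1-1)^2} - \frac{1}{(n_2-1)^2}\right\}$$

may be neglected,

$$z = \frac{\xi\sqrt{(h+\lambda)}}{h} - \left(\frac{1}{n_1-1} - \frac{1}{n_2-1}\right)\left(\lambda + \frac56\right).$$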

The values of $\xi$ and $\lambda$ for the four percentage levels are:

Percentage level      20 %       5 %        1 %       0.1 %
$\xi$                 0.8416     1.6449     2.3263    3.0902
$\lambda$            -0.3819    -0.0491     ...       1.0915
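As a rough numerical illustration of the recommended formula, the following sketch evaluates $z = \xi\sqrt{(h+\lambda)}/h - \{1/(n_1-1) - 1/(n_2-1)\}(\lambda + 5/6 - s/3)$ with $2/h = s = 1/(n_1-1) + 1/(n_2-1)$ and $\lambda = (\xi^2-3)/6$. That reading of the formula is an assumption drawn from the surviving text, and the function name `carter_z` and its arguments are hypothetical.

```python
import math

def carter_z(xi, n1, n2, keep_sd_term=True):
    """Approximate percentage point of z (sketch of formula (c) as read here).

    xi is the normal deviate for the chosen level; n1, n2 are the degrees of
    freedom. Name and signature are illustrative, not taken from the paper.
    """
    a = 1.0 / (n1 - 1)
    b = 1.0 / (n2 - 1)
    s = a + b                      # s = 2/h
    d = a - b
    h = 2.0 / s
    lam = (xi * xi - 3.0) / 6.0
    bracket = lam + 5.0 / 6.0
    if keep_sd_term:
        bracket -= s / 3.0         # this is where the (1/3) s d term enters
    return xi * math.sqrt(h + lam) / h - d * bracket

z_5pc = carter_z(1.6449, 24, 60)   # 5 % level
z_1pc = carter_z(2.3263, 24, 60)   # 1 % level
print(z_5pc, z_1pc)
```

For $n_1 = 24$, $n_2 = 60$ the two results should agree with the tabulated z to within a few units in the fourth decimal place, which is the accuracy claimed above for degrees of freedom greater than 20.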


My thanks are due to Dr J. Wishart, whose suggestion was the basis of this paper.


REFERENCES 

Cochran, W. G. (1940). Note on an approximate formula for significance levels of z. Ann. Math. Statist. 11, 93.

Cornish, E. A. & Fisher, R. A. (1937). Moments and cumulants in the specification of distributions. Rev. Inst. Int. Statist. 4, 307.

Fisher, R. A. & Yates, F. (1943). Statistical Tables, 2nd ed. Oliver and Boyd.

Kendall, M. G. (1946). Advanced Theory of Statistics, 1, 166, 2nd ed. C. Griffin and Co.

Wishart, J. (1947). The cumulants of the z and of the logarithmic $\chi^2$ and t distributions. Biometrika, 34, 170.












MISCELLANEA 


Note on the cumulants of Fisher's z-distribution


By LEO A. AROIAN, Hunter College

In a recent article Dr J. Wishart (1947) stated: 'Explicit expressions for the exact cumulants of Fisher's z-distribution do not appear ever to have been published.' Fisher's z-distribution and the related Snedecor's F-distribution formed a part of my doctor's thesis and rather full results concerning the cumulants of the z-distribution and other properties of the distribution were published in the Annals of Mathematical Statistics (Aroian, 1941) some time ago.* I should like to take this opportunity of adding certain comments on the Gram-Charlier Type A approximation to the z-distribution and the type III approximation to the F-distribution.

To obtain the cumulants of the z-distribution I expanded the moment generating function $M_z(\theta)$ in powers of $\theta$ and found the kth semi-invariant (or cumulant) of z as the coefficient of $\theta^k/k!$. The exact results correspond with Wishart's formulae (9) to (16), although given in a different form, and need not be repeated here. In addition, asymptotic formulae for $\lambda_{k:z}$, $n_1$ and $n_2$ large, were derived by means of the Euler-Maclaurin sum formula. Furthermore, another type of formula could have been given for $n_1$ small but $n_2$ large, merely by expanding that part of $\lambda_{k:z}$ in which $n_2$ occurs by the Euler-Maclaurin sum formula. The special cases for the logarithmic $\chi^2$, the logarithmic t, and the logarithmic normal probability functions follow by substituting the proper limiting values of $n_1$ and $n_2$.

In my previous paper I was overcautious concerning the type A approximation to the z-distribution. Actually the method is fairly accurate although tedious. Taking

$$F(t) = \phi(t) + A_3\,\phi^{(3)}(t) + A_4\,\phi^{(4)}(t),$$

we have

$$\int_{t_0}^{\infty} F(t)\,dt = \eta,$$

where $\eta$ is usually 0.10, 0.05, 0.025, 0.01, etc. As an example take $n_1 = 24$, $n_2 = 60$; then

$$\lambda_{1:z} = -0.0127429,\quad \sigma_z = 0.173779,\quad \lambda_{3:z} = -0.0007998,\quad \lambda_{4:z} = 0.0000867,$$

$$A_3 = -\frac{\lambda_{3:z}}{3!\,\sigma_z^3} = 0.025346,\qquad A_4 = \frac{\lambda_{4:z}}{4!\,\sigma_z^4} = 0.00396.$$

$t_0$ for $\eta = 0.05$ is 1.60094, $z_{0.05} = 0.26547$ against the accurate value of 0.26534. For the 1 % point $t_0 = 2.2338$, $z_{0.01} = 0.3754$ against the accurate value of 0.3746. When $n_1 = n_2 = 24$, $z_{0.05} = 0.3423$ against the accurate value of 0.3426.
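The tediousness of the type A method is mostly the root-finding for $t_0$. A minimal sketch of that step follows, assuming the series $F(t) = \phi(t) + A_3\phi^{(3)}(t) + A_4\phi^{(4)}(t)$, whose upper tail integrates to $Q(t_0) - A_3(t_0^2-1)\phi(t_0) + A_4(t_0^3-3t_0)\phi(t_0)$ with $Q$ the normal upper tail. The helper names are hypothetical and bisection is merely one convenient solver, not the method of the original paper.

```python
import math

def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def type_a_upper_tail(t0, a3, a4):
    """Integral from t0 to infinity of phi + a3*phi''' + a4*phi''''."""
    q = 0.5 * math.erfc(t0 / math.sqrt(2.0))          # normal upper tail
    return q - a3 * (t0 * t0 - 1.0) * phi(t0) + a4 * (t0 ** 3 - 3.0 * t0) * phi(t0)

def solve_t0(eta, a3, a4, lo=0.0, hi=6.0):
    """Bisection for the deviate t0 whose upper tail area equals eta."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if type_a_upper_tail(mid, a3, a4) > eta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Constants quoted in the text for n1 = 24, n2 = 60.
mean_z, sigma_z = -0.0127429, 0.173779
a3, a4 = 0.025346, 0.00396

t0 = solve_t0(0.05, a3, a4)
z_05 = mean_z + sigma_z * t0      # percentage point of z at the 5 % level
print(t0, z_05)
```

With the constants quoted above, the solver should land close to the $t_0$ and $z_{0.05}$ given in the text.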

The type III approximation to the F-distribution is of some interest since for $n_1$ moderate and $n_2$ large, $n_1F$ tends to be distributed as $\chi^2$ with $n_1$ degrees of freedom. Since

$$\text{Mean } F = \bar F = \frac{n_2}{n_2-2}, \qquad \sigma_F = \frac{n_2}{n_2-2}\sqrt{\frac{2(n_1+n_2-2)}{n_1(n_2-4)}},$$

$$\alpha_{3:F} = \frac{4(2n_1+n_2-2)}{n_1(n_2-6)}\sqrt{\frac{n_1(n_2-4)}{2(n_1+n_2-2)}},$$

we find the 5, 1 or 0.1 % points for F by using

$$F_{0.05} = \bar F + \sigma_F(1.64485 + 0.28392\,\alpha_{3:F} - 0.04902\,\alpha_{3:F}^2),$$

$$F_{0.01} = \bar F + \sigma_F(2.32635 + 0.73330\,\alpha_{3:F} - 0.024967\,\alpha_{3:F}^2),$$

$$F_{0.001} = \bar F + \sigma_F(3.0903 + 1.4190\,\alpha_{3:F} + 0.05667\,\alpha_{3:F}^2).$$

* [Both Dr Wishart, as author, and myself as editor regret that owing to wartime preoccupation the publication of Dr Aroian's 1941 paper was overlooked. E.S.P.]

Biometrika 34 




These formulae for the levels of significance of the F-distribution are from a previous paper (Aroian, 1943). For $n_1 = 24$, $n_2 = 60$, $F_{0.05}$ by this approximation is 1.709 compared with the accurate value of 1.700. For $n_1 = 24$, $n_2 = 100$, $F_{0.05}$ by this approximation is 1.631 against the accurate value of 1.627. For $n_1 = n_2 = 100$, $F_{0.05}$ by this approximation is 1.394 as compared with the accurate value of 1.392. While these results are not too poor, they are not so accurate as the well-known formulae of Cochran-Fisher or of E. Paulson (1942) which, for large values of $n_1$ and $n_2$, generally give 4 significant figures.
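The three worked values can be checked with a few lines of arithmetic. The sketch below assumes the mean, standard deviation and skewness of F in their usual forms, $\bar F = n_2/(n_2-2)$, $\sigma_F = \bar F\sqrt{2(n_1+n_2-2)/\{n_1(n_2-4)\}}$ and $\alpha_3 = \{4(2n_1+n_2-2)/(n_1(n_2-6))\}\sqrt{n_1(n_2-4)/\{2(n_1+n_2-2)\}}$, together with the 5 % coefficients 1.64485, 0.28392 and -0.04902; the function name is hypothetical.

```python
import math

def type3_f_point(n1, n2, coeffs=(1.64485, 0.28392, -0.04902)):
    """Type III approximation to an upper percentage point of F.

    coeffs holds the constant, linear and quadratic coefficients in the
    skewness; the default triple is the 5 % set. Requires n2 > 6.
    """
    if n2 <= 6:
        raise ValueError("the skewness of F needs n2 > 6")
    mean = n2 / (n2 - 2.0)
    sigma = mean * math.sqrt(2.0 * (n1 + n2 - 2.0) / (n1 * (n2 - 4.0)))
    alpha3 = (4.0 * (2.0 * n1 + n2 - 2.0) / (n1 * (n2 - 6.0))
              * math.sqrt(n1 * (n2 - 4.0) / (2.0 * (n1 + n2 - 2.0))))
    c0, c1, c2 = coeffs
    return mean + sigma * (c0 + c1 * alpha3 + c2 * alpha3 ** 2)

for n1, n2 in [(24, 60), (24, 100), (100, 100)]:
    print(n1, n2, round(type3_f_point(n1, n2), 3))
```

The other levels can be tried by passing the corresponding coefficient triple as `coeffs`.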

REFERENCES 

Aroian, L. A. (1941). Ann. Math. Statist. 12, 420.

Aroian, L. A. (1943). Ann. Math. Statist. 14, 93.

Paulson, E. (1942). Ann. Math. Statist. 13, 233.

Wishart, J. (1947). Biometrika, 34, 170.


A note on the mean deviation from the median 


By K. R. NAIR


For samples drawn from a normal universe, Godwin (1945) obtained the sampling distribution of the mean deviation when the individual deviations are measured from the sample mean. It is well known that the mean deviation is least when it is measured from the sample median.* Let us refer to them as 'mean deviation from mean' and 'mean deviation from median' respectively, and use the letters m and m' to denote their sample estimates.

The exact sampling distribution of m being now known and its probability integral tabulated, the question may well be asked what the distribution of m' is. Since $m' \le m$, their expectations have the same relationship

$$E(m') \le E(m). \qquad (1)$$

For samples of n from a normal population with standard deviation $\sigma$, $E(m') = f'_n\sigma$ and $E(m) = f_n\sigma$, where $f'_n \le f_n$. For getting unbiased estimates of $\sigma$ we should divide m' by $f'_n$ and m by $f_n$. What we are now interested to know is which of the two estimates has a smaller standard error. In the case of m, it has been shown by Helmert (1876) and Fisher (1920) that

$$f_n = \sqrt{\frac{2(n-1)}{n\pi}}, \qquad (2)$$

and

$$\text{S.E.}\left(\frac{m}{f_n}\right) = \sigma\sqrt{\frac{1}{n}\left\{\frac{\pi}{2} + \sqrt{n(n-2)} - n + \sin^{-1}\frac{1}{n-1}\right\}}. \qquad (3)$$
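Formulae (2) and (3) are easy to evaluate directly. The sketch below does so for small n, assuming $\sigma = 1$ and taking (3) in the form $\sqrt{\{\pi/2 + \sqrt{n(n-2)} - n + \sin^{-1}(1/(n-1))\}/n}$; the function names are hypothetical.

```python
import math

def f_n(n):
    """Expectation of the mean deviation from the mean, in units of sigma (formula (2))."""
    return math.sqrt(2.0 * (n - 1) / (n * math.pi))

def se_of_scaled_m(n):
    """Standard error of m / f_n in units of sigma (formula (3)).

    This is also the coefficient of variation of m.
    """
    inner = math.pi / 2.0 + math.sqrt(n * (n - 2)) - n + math.asin(1.0 / (n - 1))
    return math.sqrt(inner / n)

for n in (2, 3, 4, 10):
    print(n, round(f_n(n), 5), round(se_of_scaled_m(n), 6))
```

For n = 2 the second function reduces to $\sqrt{\pi/2 - 1}$, the coefficient of variation of half the range of two normal observations, which gives a simple check on the formula.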


In the case of m', we know neither $f'_n$ nor the standard error of $(m'/f'_n)$ for samples of size n.

It is obvious that when n is very large, the mean and median will differ very little from one another and hence $m' \simeq m$. It is interesting to note that, at the other end of the scale, namely, when n = 2, m and m' are identical, and equal to one-half the sample range.

To discover any real difference that may exist between the standard errors of $(m/f_n)$ and $(m'/f'_n)$, which is the same as determining the difference between the coefficients of variation of m and m', we must consider samples of size greater than 2.

(i) Let us take n = 3, and let $x_1, x_2, x_3$ be the observed values arranged in order of ascending magnitude. We at once find that

$$m' = \tfrac13(x_3 - x_1). \qquad (4)$$

The distribution of m' for samples of 3 is therefore derivable from that of the range. The probability integral of the range has been tabulated by Pearson & Hartley (1942) for n = 2 to 20. For our purpose it is necessary only to know the values of the mean range ($\bar w$) and the standard error of the range ($\sigma_w$)


* When n, the sample size, is an odd number, the sample median is by definition the value of the $\tfrac12(n+1)$th ranked observation. When n is even, the sample median is conventionally taken as the mean of the $\tfrac12 n$th and $\tfrac12(n+2)$th ranked values. The mean deviation from the median will have the same magnitude whatever value, between the $\tfrac12 n$th and $\tfrac12(n+2)$th ranked values, the median takes, when n is even. No complication is therefore ...





for samples of 3. This can be calculated, correct to six decimal places, from certain numerical values given by Pearson (1926). Using his figures,

$$\bar w = 1.692568 \times \sigma, \qquad (5)$$

$$\sigma_w = 0.888368 \times \sigma. \qquad (6)$$

The value of $f'_3$ for samples of 3 is, therefore,

$$f'_3 = \tfrac13 \times 1.692568 = 0.56419,$$

and the standard error of $m'/f'_3$ is

$$\frac{0.888368}{1.692568}\,\sigma = 0.52486\,\sigma, \qquad (7)$$

correct to five decimal places.

The corresponding values for $f_3$ and standard error of $(m/f_3)$ are obtained by putting n = 3 in equations (2) and (3) and are

$$f_3 = 0.65147 \qquad (8)$$

and

$$\text{S.E. of } (m/f_3) = 0.52486\,\sigma, \qquad (9)$$

correct to five decimal places.

Although (9) can be evaluated to any number of decimal places, we are not in a position to bring (7) to a higher order of accuracy than five decimal places. It is very unlikely that (7) and (9) are absolutely identical, but we may safely conclude that they are practically the same.

(ii) We next come to samples of 4. If $x_1, x_2, x_3, x_4$ be the observations arranged in order of ascending magnitude, the mean deviation from median is given by

$$m' = \tfrac14(x_4 + x_3 - x_2 - x_1). \qquad (10)$$

The distribution of m' follows immediately from 'some order statistic distributions for samples of size 4' obtained by Walsh (1946) and is as follows:

p(mOd»!.' = y ^ er^v dyj dm'. (11) 

The probability integral of m' is given by 

rm‘ I (’4m' 1 / 2 f^m' „ \3 

P(«') = J ^ p(m')dm' = y ^ • (12 

The values of P(m') given by (12) can easily be evaluated using the normal probability integral table and are given in cols. (3) and (6) of the table below, alongside corresponding values (given in cols. (2) and (5)) for the probability integral of the mean deviation (m) from the mean, for samples of 4, copied from Godwin's (1945) tables.


Table giving the probability integral of the mean deviation from (a) mean and (b) median for samples of four observations from a normal universe ($\sigma = 1$)

m (or m')    P(m)       P(m')     |  m (or m')    P(m)       P(m')
   0.0      0.00000    0.00000    |     1.3      0.96758    0.97229
   0.1      0.00333    0.00398    |     1.4      0.98229    0.98475
   0.2      0.02634    0.03003    |     1.5      0.99073    0.99192
   0.3      0.07879    0.09204    |     1.6      0.99534    0.99588
   0.4      0.16693    0.19139    |     1.7      0.99776    0.99798
   0.5      0.28345    0.31818    |     1.8      0.99896    0.99905
   0.6      0.41662    0.45629    |     1.9      0.99953    0.99957
   0.7      0.54836    0.58961    |     2.0      0.99980    0.99981
   0.8      0.66934    0.70692    |     2.1      0.99992    0.99992
   0.9      0.77040    0.79964    |     2.2      0.99997    0.99997
   1.0      0.84860    0.86962    |     2.3      0.99999    0.99999
   1.1      0.90502    0.91888    |     2.4      1.00000    1.00000
   1.2      0.94321    0.95162    |






We note that although m and m' have an infinite range from 0 to $\infty$, their probability integrals rapidly approach unity, this value being reached to five decimal place accuracy when m (or m') $= 2.4\sigma$. We can approximately work out the moments of the two distributions from the table above. The values of the mean and the standard deviation (applying Sheppard's correction for grouping) of m and m' so obtained are given below:

Mean:                       $\bar m = 0.690984\sigma$,          $\bar m' = 0.663187\sigma$,
Standard deviation:         $\sigma_m = 0.297015\sigma$,        $\sigma_{m'} = 0.292979\sigma$,        (13)
Coefficient of variation:   $\sigma_m/\bar m = 0.429842$,       $\sigma_{m'}/\bar m' = 0.441776$.

The values of $\bar m$ and $\sigma_m/\bar m$ obtained from the exact formulae (2) and (3) are

$$\bar m = \sigma\sqrt{\frac{3}{2\pi}} = 0.690988\,\sigma, \qquad \frac{\sigma_m}{\bar m} = \tfrac12\sqrt{\left(\tfrac12\pi + \sin^{-1}\tfrac13 + 2\sqrt2 - 4\right)} = 0.429842, \qquad (14)$$

showing close agreement with the values given in (13) for the mean and coefficient of variation of m. We may therefore consider the mean and coefficient of variation of m', approximately evaluated in (13), to be of sufficient accuracy to warrant the conclusion that, for samples of size 4, the mean deviation from the mean leads to a more 'efficient' estimate of the population standard deviation than the mean deviation from the median. As the distribution of the latter is not known for n > 4, we are not in a position to say whether this conclusion holds good, in general, for all values of n.

In conclusion, it seems worth making the following point:

(a) if expressions for the expectation and variance of m' were available and tables of its probability integral worked out,

(b) if the efficiency of the m' estimate compared to the m estimate for n > 4 was not appreciably worse than for the case n = 4,

there would be strong practical grounds for using m' rather than m in view of greater simplicity in calculation. In both cases we must first arrange the observations in order of magnitude. Then if $x_1, x_2, \dots, x_n$ are the ordered observations, m' may be calculated from the formula

$$m' = \frac1n\{(x_{n-l+1} + x_{n-l+2} + \dots + x_n) - (x_1 + x_2 + \dots + x_l)\}, \qquad (15)$$

where $l = \tfrac12 n$ or $\tfrac12(n-1)$ according as n is even or odd.

For m, however, we must also calculate the arithmetic mean $\bar x$ and look for $x_k$ and $x_{k+1}$ between which $\bar x$ lies. Then m can be obtained from one of the three formulae

$$\frac{nm}{2k} = \bar x - \frac{x_1 + x_2 + \dots + x_k}{k},$$

$$\frac{nm}{2(n-k)} = \frac{x_{k+1} + \dots + x_n}{n-k} - \bar x, \qquad (16)$$

$$\frac{n^2m}{2k(n-k)} = \frac{x_{k+1} + \dots + x_n}{n-k} - \frac{x_1 + \dots + x_k}{k}.$$

This certainly involves a rather longer process.
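The two short-cut calculations can be compared with the direct definitions in a few lines. The sketch below assumes that m' is computed as the difference between the sums of the largest and smallest $l$ ordered values divided by n, with $l$ the integer part of $n/2$, and that m is computed from the first of the three formulae, $nm = 2(k\bar x - x_1 - \dots - x_k)$; the function names are hypothetical.

```python
import statistics

def mean_dev_from_median(values):
    """m' from the ordered sample: (sum of top l - sum of bottom l) / n, l = n // 2."""
    x = sorted(values)
    n = len(x)
    l = n // 2
    return (sum(x[n - l:]) - sum(x[:l])) / n

def mean_dev_from_mean(values):
    """m from the ordered sample via n*m = 2*(k*xbar - (x_1 + ... + x_k))."""
    x = sorted(values)
    n = len(x)
    xbar = sum(x) / n
    k = sum(1 for v in x if v < xbar)      # number of observations below the mean
    return 2.0 * (k * xbar - sum(x[:k])) / n

sample = [2.0, 9.0, 4.0, 7.0, 3.0]
med = statistics.median(sample)
xbar = sum(sample) / len(sample)
direct_m_prime = sum(abs(v - med) for v in sample) / len(sample)
direct_m = sum(abs(v - xbar) for v in sample) / len(sample)
print(mean_dev_from_median(sample), direct_m_prime)
print(mean_dev_from_mean(sample), direct_m)
```

The extra step for m is the location of $k$, which needs the mean to be computed first; m' needs only the ordering.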

It is interesting to note that m' becomes a special case of the measure of dispersion based on difference between the sums of the first and the last r observations (in order of magnitude) suggested by Jones (1946), the range becoming another special case of the same measure, when r = 1.


REFERENCES 

Fisher, R. A. (1920). Mon. Not. R. Astr. Soc. 80, 758.

Godwin, H. J. (1945). Biometrika, 33, 254.

Helmert, W. (1876). Astr. Nachr. 88, no. 2096.

Jones, A. E. (1946). Biometrika, 33, 274.

Pearson, E. S. (1926). Biometrika, 18, 173.

Pearson, E. S. & Hartley, H. O. (1942). Biometrika, 32, 301.

Walsh, J. E. (1946). Ann. Math. Statist. 17, 246.





On the method of paired comparisons 


By P. A. P. MORAN, Institute of Statistics, Oxford University

M. G. Kendall & B. Babington Smith (1940) have discussed the 'method of paired comparisons' for investigating preferences. Suppose we are given n objects A, B, C, ..., and an observer is asked to choose between every pair. If A is preferred to B we write A → B. If the observer is not completely consistent, either because of his own inefficiency or because the objects are not really capable of being ranked in respect of the quality under consideration, he may make preferences of the type A → B → C → A, and we call this an inconsistent or circular triad. Write d for the number of circular triads in a given experiment. Then Kendall & Babington Smith show that

$$\zeta = 1 - \frac{24d}{n^3 - n} \quad (n \text{ odd}),$$

$$\zeta = 1 - \frac{24d}{n^3 - 4n} \quad (n \text{ even}),$$

may be regarded as a 'coefficient of consistence' and lies between 0 and 1, being capable of attaining both these limits.
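To make the definition concrete, the following sketch counts circular triads by brute force and evaluates the coefficient $1 - 24d/(n^3-n)$ for n odd or $1 - 24d/(n^3-4n)$ for n even. The representation of the preferences as a boolean matrix and the function names are assumptions made for illustration only.

```python
from itertools import combinations

def circular_triads(pref):
    """Count circular triads; pref[i][j] is True when object i is preferred to j."""
    n = len(pref)
    d = 0
    for i, j, k in combinations(range(n), 3):
        forward = pref[i][j] and pref[j][k] and pref[k][i]
        backward = pref[j][i] and pref[k][j] and pref[i][k]
        if forward or backward:
            d += 1
    return d

def coefficient_of_consistence(pref):
    """1 - 24d/(n^3 - n) for odd n, 1 - 24d/(n^3 - 4n) for even n."""
    n = len(pref)
    d = circular_triads(pref)
    denom = n ** 3 - n if n % 2 == 1 else n ** 3 - 4 * n
    return 1.0 - 24.0 * d / denom

# Three objects preferred in a circle: 0 -> 1 -> 2 -> 0.
cycle = [[False, True, False],
         [False, False, True],
         [True, False, False]]
# Four objects ranked consistently: i is preferred to j whenever i < j.
ranked = [[i < j for j in range(4)] for i in range(4)]
print(coefficient_of_consistence(cycle), coefficient_of_consistence(ranked))
```

The two examples sit at the two limits mentioned in the text: a single circular triad among three objects, and a completely consistent ranking.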

Now suppose that each comparison is made at random so that there are equal chances that A → B and B → A. The distribution of d is then of interest. They calculate this distribution exactly for n = 2, ..., 7 and conjecture that its moments are given by

$$\mu'_1 = \frac14\binom{n}{3}, \qquad \mu_2 = \frac{3}{16}\binom{n}{3}, \qquad \mu_3 = -\frac{3}{32}\binom{n}{3}(n-4),$$

$$\mu_4 = \frac{3}{256}\binom{n}{3}\left\{9\binom{n-3}{3} + 39\binom{n-3}{2} + 9\binom{n-3}{1} + 7\right\},$$

these being polynomials in n which agree with their numerical calculations for n = 2, ..., 7. They also conjecture that the distribution tends to normality when n increases. In the present note we prove these statements.
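For small n the distribution of d can be enumerated completely, which gives a direct check on the moment formulae. The sketch below runs through all $2^{n(n-1)/2}$ equally likely outcomes and compares the exact central moments with the expressions $\tfrac14\binom n3$, $\tfrac3{16}\binom n3$, $-\tfrac3{32}\binom n3(n-4)$ and $\tfrac3{256}\binom n3\{9\binom{n-3}3 + 39\binom{n-3}2 + 9(n-3) + 7\}$; the function names are hypothetical and the enumeration is only practical for n up to about 6.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

def exact_moments(n):
    """Mean and 2nd-4th central moments of d under random comparisons (n >= 3)."""
    pairs = list(combinations(range(n), 2))
    triads = list(combinations(range(n), 3))
    counts = {}
    for outcome in product((False, True), repeat=len(pairs)):
        beats = {}
        for (i, j), i_wins in zip(pairs, outcome):
            beats[(i, j)] = i_wins
            beats[(j, i)] = not i_wins
        d = 0
        for i, j, k in triads:
            if (beats[(i, j)] and beats[(j, k)] and beats[(k, i)]) or \
               (beats[(j, i)] and beats[(k, j)] and beats[(i, k)]):
                d += 1
        counts[d] = counts.get(d, 0) + 1
    total = 2 ** len(pairs)
    mean = Fraction(sum(d * c for d, c in counts.items()), total)
    def central(r):
        return sum(Fraction(c, total) * (d - mean) ** r for d, c in counts.items())
    return mean, central(2), central(3), central(4)

def formula_moments(n):
    """The same four quantities from the closed expressions (n >= 3)."""
    c = comb(n, 3)
    mu4 = Fraction(3 * c, 256) * (9 * comb(n - 3, 3) + 39 * comb(n - 3, 2) + 9 * (n - 3) + 7)
    return Fraction(c, 4), Fraction(3 * c, 16), Fraction(-3 * c * (n - 4), 32), mu4

for n in (3, 4, 5):
    print(n, exact_moments(n) == formula_moments(n))
```

Exact rational arithmetic is used so that agreement is an identity rather than a rounding coincidence.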

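Let the objects be numbered from 1 to n. Write $P_{ijk} = 1$ if the triad (i, j, k) is circular, and $P_{ijk} = 0$ if it is not. Then $d = \Sigma P_{ijk}$, the sum being taken over all such triads. Now by enumerating the various cases we see that $E(P_{ijk}) = \tfrac14$ and so $\mu'_1(d) = \tfrac14\binom{n}{3}$. Now consider $\mu'_2(d) = E[(\Sigma P_{ijk})^2]$. Consider the types of terms which result when we expand this. In the first place we have terms typified by $P_{123}^2$, and these contribute $\tfrac14\binom{n}{3}$ to $\mu'_2(d)$. Similarly, we have terms typified by $P_{123}P_{145}$, $P_{123}P_{124}$ and $P_{123}P_{456}$, and the number of these are respectively $\tfrac32\binom{n}{3}(n-3)(n-4)$, $3\binom{n}{3}(n-3)$ and $\binom{n}{3}\binom{n-3}{3}$, whilst their expectations are each $\tfrac1{16}$. It follows that

$$\mu_2(d) = \frac14\binom{n}{3} + \frac1{16}\binom{n}{3}\left\{\frac32(n-3)(n-4) + 3(n-3) + \binom{n-3}{3}\right\} - \frac1{16}\binom{n}{3}^2 = \frac{3}{16}\binom{n}{3}.$$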



The calculation of $\mu_3(d)$ is a good deal more complicated. $\mu'_3(d) = E[(\Sigma P_{ijk})^3]$, and on expanding we get 16 types of terms, typified by products such as $P_{123}^3$, $P_{123}^2P_{124}$, ..., $P_{123}P_{145}P_{678}$, $P_{123}P_{456}P_{789}$. After some calculation we find the sum of the contributions of these to be

$$\frac{1}{2304}\binom{n}{3}\{n^6 - 6n^5 + 13n^4 + 42n^3 - 158n^2 - 108n + 864\}.$$

Reducing to the mean we get $\mu_3(d) = -\dfrac{3}{32}\dbinom{n}{3}(n-4)$.

The calculation of $\mu_4(d)$ is a great deal more complicated, there being 86 terms which are not zero; we finally obtain, after lengthy calculations,

$$\mu'_4(d) = \frac{1}{55296}\binom{n}{3}\{n^9 - 9n^8 + 33n^7 + 45n^6 - 582n^5 + 504n^4 + 5732n^3 - 10692n^2 - 30024n + 80352\},$$

and so

$$\mu_4 = \frac{1}{55296}\binom{n}{3}\{972n^3 + 972n^2 - 36936n + 80352\},$$

which reduces to the conjectured result.

We now prove that the distribution tends to normality. To do this, it is sufficient (Kendall, 1943, p. 110) to prove that

$$\frac{\mu_{2m}}{\mu_2^{m}} \to \frac{(2m)!}{2^m m!}, \qquad \frac{\mu_{2m+1}}{\mu_2^{m+\frac12}} \to 0 \qquad (m = 1, 2, \dots).$$

Consider the second of these first. Write $Q_{ijk} = P_{ijk} - \tfrac14$. Then

$$\mu_{2m+1}(d) = E[(\Sigma Q_{ijk})^{2m+1}].$$

It is clear that for any given m we could calculate $\mu_{2m+1}$, given sufficient labour, by expanding this and considering the expectation of each type of term and calculating the number of times it occurs, which will be a polynomial in n. Now consider the various types of terms in the expansion of $(\Sigma Q_{ijk})^{2m+1}$. We classify these terms according to whether the Q's have common suffixes. Let $Q_{ijk}Q_{lmn}\dots Q_{pqr}$ be a typical product in the expansion. If this can be separated into p groups of products of Q's such that different groups have no common suffixes whilst within each group the triads are connected to each other by having common points, we shall say such a product 'contains p groups'. Moreover, the number of times such a term occurs will be a polynomial in n whose order is equal to the number of distinct suffixes occurring in the product. If in a group a suffix only appears once, the inconsistency of the triad containing it is unaffected by the remainder of the group and the expectation of the product of Q's in that group will be zero. It follows that in all those terms which contribute something non-zero to $\mu_{2m+1}$, none of the groups can contain a suffix which appears only once. Therefore, since all terms which contain more than m groups will have at least one group consisting of a single Q, the expectation of such terms will be zero. It follows that $\mu_{2m+1}$ is a polynomial in n, of degree $3m + 1$ (the integral part of $\tfrac32(2m+1)$) at most, whose coefficients depend on m only. But $\mu_2(d)$ is of degree 3 in n and so

$$\frac{\mu_{2m+1}}{\mu_2^{m+\frac12}} \to 0.$$

Now consider $\mu_{2m}$. This is a polynomial in n, and our aim is to find the order and coefficient of the term of largest order. In the first place we need only consider terms with m or less groups, for if a term has more than m groups, one at least will consist of a single Q and the expectation of the term will be zero. Moreover, as before, in each term, the suffixes in each group must each occur at least twice in that group. The number of times each type of term occurs will be a polynomial in n of order equal to the total number of distinct suffixes in that term. As we shall show the leading term in $\mu_{2m}(d)$ to be of order 3m, we can neglect terms whose frequency is less than this and therefore we can neglect all terms in which a suffix appears more than twice. Now consider a term with fewer than m groups and therefore containing a group of order greater than two in the Q's. As no suffix can occur more than twice, no Q can occur more than once. Consider any $Q_{ijk}$, say, of this group. Then either the suffixes i, j, k are common to three other triads or one, i, say, is common to another triad and j, k common to a third. In either case evaluation of the expectation shows it to be zero. We can therefore restrict our attention to the case where there are m groups each containing two triads. Such groups can only be of the form $Q_{123}^2$, $Q_{123}Q_{124}$, $Q_{123}Q_{145}$ and the expectations of the two latter are zero whilst the expectation of $Q_{123}^2$ is $\tfrac{3}{16}$.

The number of groups is m and the number of ways of choosing m such distinct pairs out of 2m is $\dfrac{(2m)!}{2^m m!}$, so that the leading term in $\mu_{2m}$ is

$$\frac{(2m)!}{2^m m!}\left(\frac{3}{16}\right)^m\left(\frac{n^3}{6}\right)^m,$$

whilst the leading term in $\mu_2$ is $\dfrac{3}{16}\cdot\dfrac{n^3}{6}$, and so

$$\frac{\mu_{2m}}{\mu_2^m} \to \frac{(2m)!}{2^m m!}.$$

The distribution therefore tends to normality.
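The case m = 2 of the limit can be watched numerically from the closed expressions for the second and fourth moments. The sketch below assumes $\mu_2 = \tfrac3{16}\binom n3$ and $\mu_4 = \tfrac1{55296}\binom n3(972n^3 + 972n^2 - 36936n + 80352)$ and prints the ratio $\mu_4/\mu_2^2$, which should approach $4!/(2^2\,2!) = 3$; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def kurtosis_ratio(n):
    """mu_4 / mu_2^2 for the number of circular triads under random comparisons."""
    c = comb(n, 3)
    mu2 = Fraction(3 * c, 16)
    mu4 = Fraction(c, 55296) * (972 * n ** 3 + 972 * n ** 2 - 36936 * n + 80352)
    return mu4 / mu2 ** 2

for n in (4, 10, 100, 1000):
    print(n, float(kurtosis_ratio(n)))
```

The approach to 3 is slow, of order 1/n, which is consistent with the leading-term argument above in which only the highest power of n is retained.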


REFERENCES 

Kendall, M. G. & Babington Smith, B. (1940). Biometrika, 31, 324.

Kendall, M. G. (1943). The Advanced Theory of Statistics, 1. London: Charles Griffin and Co.


Notes on the calculation of autocorrelations of linear 
autoregressive schemes 


By M. H. QUENOUILLE


1. Bartlett (1946) has recently shown how, for a series of observations, we can test whether the observations can be adequately represented by a linear autoregressive scheme

$$u_{n+p} + a_1u_{n+p-1} + \dots + a_pu_n = \epsilon_{n+p}, \qquad (1)$$

where the $a_i$ are known or fitted values, and $\epsilon_{n+p}$ is an error component independent of the earlier terms $u_{n+p-1}, u_{n+p-2}, \dots$. Bartlett's test is based on the formula

$$\operatorname{cov}(r_s, r_{s+t}) \sim \frac1n\sum_{i=-\infty}^{\infty}\rho_i\rho_{i+t},$$

where $r_s$ is the estimate of the true autocorrelation $\rho_s$ between $u_t$ and $u_{t+s}$.

The purpose of the note is to demonstrate how, using generating functions, $\rho_i$ and $\sum_{i=-\infty}^{\infty}\rho_i\rho_{i+t}$ can be calculated with the minimum of computation.

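2. The method of generating functions seems to have been used by Wold (1938), who applied them to finding the variances and covariances of linear forms of finite extent in variables such as $\epsilon_n$. We shall, however, be concerned with linear forms of infinite extent.

It can easily be shown that the solution of (1) can be written

$$u_n = \epsilon_n + b_1\epsilon_{n-1} + b_2\epsilon_{n-2} + \dots, \qquad (2)$$

where $(1 + a_1t + \dots + a_pt^p)^{-1} = 1 + b_1t + b_2t^2 + \dots$.

For example, if $u_{n+2} + au_{n+1} + bu_n = \epsilon_{n+2}$, we have

$$(1 + at + bt^2)^{-1} = (1 - 2x\cos\theta + x^2)^{-1} = 1 + \frac{\sin 2\theta}{\sin\theta}\,x + \frac{\sin 3\theta}{\sin\theta}\,x^2 + \dots,$$

where $\cos\theta = -a/(2\sqrt b)$, $x = t\sqrt b$, and hence

$$b_{l-1} = b^{\frac12(l-1)}\,\frac{\sin l\theta}{\sin\theta} = \frac{2b^{\frac12 l}\sin l\theta}{\sqrt{(4b - a^2)}}. \qquad (3)$$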
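The trigonometric form of the coefficients for a second-order scheme can be checked against the plain power-series recurrence. The sketch below assumes the scheme $u_{n+2} + au_{n+1} + bu_n = \epsilon_{n+2}$ with $4b > a^2$ and $b > 0$, so that $\cos\theta = -a/(2\sqrt b)$ defines a real angle, and takes $b_l = b^{l/2}\sin(l+1)\theta/\sin\theta$; the function names are hypothetical.

```python
import math

def coefficients_by_recurrence(a, b, count):
    """Coefficients of 1/(1 + a t + b t^2): b_0 = 1, b_1 = -a, b_l = -a b_{l-1} - b b_{l-2}."""
    coeffs = [1.0, -a]
    while len(coeffs) < count:
        coeffs.append(-a * coeffs[-1] - b * coeffs[-2])
    return coeffs[:count]

def coefficient_closed_form(a, b, l):
    """b_l = b^(l/2) sin((l+1) theta) / sin(theta), with cos(theta) = -a / (2 sqrt(b))."""
    theta = math.acos(-a / (2.0 * math.sqrt(b)))
    return b ** (l / 2.0) * math.sin((l + 1) * theta) / math.sin(theta)

a, b = -1.1, 0.5      # values of the scheme used later in the note's example
recur = coefficients_by_recurrence(a, b, 8)
closed = [coefficient_closed_form(a, b, l) for l in range(8)]
for r, c in zip(recur, closed):
    print(round(r, 6), round(c, 6))
```

The recurrence is the cheaper route numerically; the closed form makes the damped oscillation of the weights explicit.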

3. Using this generating function, we have 




n-i-i 


( 4 ) 


Now the expansion of (4) can be achieved by splitting into partial fractions and, in general, we can let
$\sigma \to 1$ before this operation is performed. Thus

P 13^ -h *r • 4- 




(I + c(il"l- . .. 4-ayl^) + ^4" "f Oy) 

and, using $\rho_i = \rho_{-i}$, we can see that

A, = -B,la, ( 4 = 0 ) 

= Uy — (4= 1, ...y" 1), 

Thus the autocorrelations will be generated by 

i fly 4- + ■ ■ ■ + I 4- + ■ . . 4- 


P 4* cty 4” ■ ■ * 4" oty 


(5) 


1 + 


jdj) 1 "h Oyi "h ... 4"^^^^ Affi l-pOyt ^ 4" * . • 4" Cty 


( 6 ) 


where the first term is expanded in powers of t and the second term is expanded in powers of $t^{-1}$.

00 

4. The expression (6) can now be squared to give a generating function for S PtPi+t- If 'will be 
necessary to split ... 4- iB,l^-»4- ... 4- B,) 

(1 4” %f 4" ••• ’hdjt’) 4" ••• 4‘ny) 

into partial fractions, but the labour’ will be reduced since the matrix of the coefficients of the equations 
in Bf will be unaltered. 

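5. To illustrate the method, we can consider Kendall's series 1, which was used by Bartlett in his example.

The autoregressive scheme for this series is

$$u_{n+2} - 1.1u_{n+1} + 0.5u_n = \epsilon_{n+2},$$

so that

$$\sigma^2\sum_{i=-\infty}^{\infty}\rho_it^i = \frac{t^2}{(1 - 1.1t + 0.5t^2)(t^2 - 1.1t + 0.5)} = \frac{-2B_2 + (B_1 + 2.2B_2)t}{1 - 1.1t + 0.5t^2} + \frac{B_2 + B_1t}{t^2 - 1.1t + 0.5},$$

where

$$\begin{bmatrix}B_1\\ B_2\end{bmatrix} = \begin{bmatrix}-3.7692 & 2.1154\\ 2.1154 & -1.4423\end{bmatrix}\begin{bmatrix}0\\ 1\end{bmatrix} = \begin{bmatrix}2.1154\\ -1.4423\end{bmatrix}.$$

Thus $\sigma^2 = 2.8846$, and

$$\sum_{i=-\infty}^{\infty}\rho_it^i = 1 + \frac{t(0.7333 - 0.5t)}{1 - 1.1t + 0.5t^2} + \frac{t^{-1}(0.7333 - 0.5t^{-1})}{1 - 1.1t^{-1} + 0.5t^{-2}}, \qquad (7)$$

so that

$$\rho_i - 1.1\rho_{i-1} + 0.5\rho_{i-2} = 0 \quad (i > 0).$$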

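If we now consider the square of the expression (7) we have a product term

$$\frac{2t(0.7333 - 0.5t)(0.7333t - 0.5)}{(1 - 1.1t + 0.5t^2)(t^2 - 1.1t + 0.5)} = \frac{-2B_2 + (B_1 + 2.2B_2)t}{1 - 1.1t + 0.5t^2} + \frac{B_2 + B_1t}{t^2 - 1.1t + 0.5}, \qquad (8)$$

where

$$\begin{bmatrix}B_1\\ B_2\end{bmatrix} = \begin{bmatrix}-3.7692 & 2.1154\\ 2.1154 & -1.4423\end{bmatrix}\begin{bmatrix}0.7333\\ 1.5754\end{bmatrix} = \begin{bmatrix}0.5686\\ -0.7210\end{bmatrix},$$

and, if we write $P_t = \sum_{i=-\infty}^{\infty}\rho_i\rho_{i+t}\Big/\sum_{i=-\infty}^{\infty}\rho_i^2$,

$$\sum_{i=-\infty}^{\infty}\rho_i^2\sum_{t=-\infty}^{\infty}P_tt^t = 1 + \frac{2t(0.7333 - 0.5t)}{1 - 1.1t + 0.5t^2} + t^2\left(\frac{0.7333 - 0.5t}{1 - 1.1t + 0.5t^2}\right)^2 + 1.4420 + \frac{t(0.5686 - 0.7210t)}{1 - 1.1t + 0.5t^2} + \text{terms in } t^{-1}$$

$$= 2.4420 + \frac{t(2.0352 - 1.7210t)}{1 - 1.1t + 0.5t^2} + \frac{t^2(0.5377 - 0.7333t + 0.25t^2)}{(1 - 1.1t + 0.5t^2)^2} + \text{terms in } t^{-1}. \qquad (9)$$

From this we have $\sum_{i=-\infty}^{\infty}\rho_i^2 = 2.4420$ and the 'correlations' $P_t$ of the correlations are 0.8334, 0.4321, 0.0006, .... Successive terms may be calculated using the relation

$$P_t - 2.2P_{t-1} + 2.21P_{t-2} - 1.1P_{t-3} + 0.25P_{t-4} = 0 \quad (t > 0). \qquad (10)$$

The calculation of $\sum_{t=-\infty}^{\infty}P_t^2$, suggested by Bartlett, can also be made by this method, but it is more arduous, and the first few terms will give a good approximation.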
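The figures of this example can be reproduced by brute-force summation, which is a useful cross-check on the partial-fraction work. The sketch below assumes the scheme $u_{n+2} - 1.1u_{n+1} + 0.5u_n = \epsilon_{n+2}$, generates the autocorrelations from $\rho_1 = -a_1/(1 + a_2)$ and the recurrence $\rho_i = -a_1\rho_{i-1} - a_2\rho_{i-2}$, and uses the standard second-order variance ratio $(1+a_2)/\{(1-a_2)((1+a_2)^2 - a_1^2)\}$; the function names and the truncation at 400 terms are choices made for illustration.

```python
def autocorrelations(a1, a2, count):
    """rho_0, ..., rho_{count-1} for u_{n+2} + a1 u_{n+1} + a2 u_n = e_{n+2}."""
    rho = [1.0, -a1 / (1.0 + a2)]
    while len(rho) < count:
        rho.append(-a1 * rho[-1] - a2 * rho[-2])
    return rho[:count]

def variance_ratio(a1, a2):
    """var(u) / var(e) for a stationary second-order scheme."""
    return (1.0 + a2) / ((1.0 - a2) * ((1.0 + a2) ** 2 - a1 ** 2))

def lagged_sum(rho, t):
    """Two-sided sum of rho_i * rho_{i+t}, truncated where the series has died away."""
    reach = len(rho) - 1 - t            # keeps |i + t| inside the stored values
    return sum(rho[abs(i)] * rho[abs(i + t)] for i in range(-reach, reach + 1))

a1, a2 = -1.1, 0.5
rho = autocorrelations(a1, a2, 400)
sum_sq = lagged_sum(rho, 0)
p = [lagged_sum(rho, t) / sum_sq for t in range(1, 4)]
print(variance_ratio(a1, a2), sum_sq, p)
```

Direct summation converges quickly here because the autocorrelations decay like $0.5^{i/2}$; the generating-function route is preferable when closed expressions are wanted.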


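7. The same method can be used to calculate the appropriate number of degrees of freedom for testing the correlation between two linear autoregressive schemes.

In general, if $E(u_iu_j) = \rho_{ij}\sigma^2$, $E(v_iv_j) = \rho'_{ij}\sigma'^2$ and

$$r = \frac{\sum_{i=1}^{n}u_iv_i}{\left(\sum_{i=1}^{n}u_i^2\sum_{i=1}^{n}v_i^2\right)^{\frac12}},$$

then

$$\operatorname{var} r \sim \frac{\sum_{i=1}^{n}\sum_{j=1}^{n}\rho_{ij}\rho'_{ij}}{\sum_{i=1}^{n}\rho_{ii}\sum_{i=1}^{n}\rho'_{ii}}.$$

For linear autoregressive schemes, $\rho_{ij} = \rho_{i-j}$, and

$$\operatorname{var} r \sim \frac{n + 2(n-1)\rho_1\rho'_1 + \dots + 2\rho_{n-1}\rho'_{n-1}}{n^2} \sim \frac1n\sum_{i=-\infty}^{\infty}\rho_i\rho'_i.$$

Thus, provided n is large, r can be tested with $n\Big/\sum_{i=-\infty}^{\infty}\rho_i\rho'_i$ degrees of freedom, and the calculation of $\sum_{i=-\infty}^{\infty}\rho_i\rho'_i$ can be made by the above method.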

8. Finally, it is worth noting that, for autoregressive schemes involving m observables, it is possible to extend this method by the use of m parameters to calculate the correlations within and between the observables, provided that adequate estimates of the coefficients of the equations are available. In practice, however, the procedure will often be reversed, and estimates of the coefficients of the autoregressive schemes will be obtained by equating the theoretical and observed correlations.


REFERENCES

Bartlett, M. S. (1946). J. R. Statist. Soc. Suppl. 8, 27.

Wold, H. (1938). A Study in the Analysis of Stationary Time-series. Uppsala.





Approximate formulae for the percentage points of the incomplete beta function and of the $\chi^2$ distribution

By D. HALTON THOMSON

Valuable 'Tables of percentage points of the Incomplete Beta Function' have been published in Biometrika (Thompson, 1941a) giving numerical values of percentage points at various probability levels between P = 0.995 and P = 0.005 for degrees of freedom $\nu_1 = 2q$ and $\nu_2 = 2p$ ranging up to 120, and with an accuracy of five significant figures. In the same volume, a 'Table of the percentage points of the $\chi^2$ distribution' was also published (Thompson, 1941b) for values at the same probability levels and degrees of freedom ranging up to $\nu = 100$, and with an accuracy of six significant figures, thus supplementing the table of that function originally due to R. A. Fisher (Fisher & Yates, 1938).

Cases arise in practice where the tails of the frequency distribution of a large population are of special interest, thus involving (in the case of the beta function) values of 2p larger than 120, with a small 2q, or vice versa. Harmonic interpolation between 120 and infinity, however, leads to substantial errors, as is found when the values of the percentage points x are expressed in terms of their tail values (x or 1 - x < 0.5). This Note shows that close approximations to such extreme values may be determined by using the $\chi^2$ table as an auxiliary table to extend the Beta Function Tables in conjunction with certain simple alternative formulae. Comparisons within the range of the published Beta Function Tables are made indicating the degree of accuracy within that range. The accuracy of these formulae beyond that range increases rapidly with increasing 2p and decreasing 2q (and vice versa), so that they can be applied with confidence under such conditions.

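The 'normalized' form of the Incomplete Beta Function, in the usual notation, is

$$I_x(p, q) = \frac{\Gamma(p+q)}{\Gamma(p)\,\Gamma(q)}\int_0^x x^{p-1}(1-x)^{q-1}\,dx, \qquad (1)$$

in which, for a given P, $1 - x(q, p)$ in the tables denotes the upper percentage point and $x(p, q)$ the lower percentage point.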

It is known that, whenp is large and g is small compared withp, this form tends towards the Incomplete 
Gamma Function „ ,, 

r(q)]o 

where $x(p, q) = e^{-t}$. This in turn may be transformed to the $\chi^2$ distribution by putting $pt = \tfrac12\chi^2_\nu(P)$. For a given large p and small q, therefore, the percentage point in terms of $\chi^2$ is given approximately by

$$x(2p, 2q) \simeq \exp\left[-\frac{\chi^2_\nu(P)}{2p}\right], \qquad (2)$$

where $2q = \nu$ in the $\chi^2$ table. This expression gives the exact value of x when 2q = 2, but for larger 2q the error, which is consistently negative, increases rapidly with increasing 2q unless 2p is very large, much larger than 120. It is, therefore, of limited practical use. The following modifications were in consequence evolved.

Approximation A

Consider the constant of integration in the original form (1) which, when expanded, is

$$\frac{(p+q-1)(p+q-2)\dots(p+1)p}{\Gamma(q)}.$$

Let the terms q - 1, q - 2, ..., 1, 0 be averaged; the constant as a first approximation then becomes

$$\frac{\{p + \tfrac12(q-1)\}^q}{\Gamma(q)}.$$

The numerator suggests that a more accurate approximation for x would be obtained by substituting $p + \tfrac12(q-1)$ in place of p in (2), thus leading to

$$x(2p, 2q) \simeq \exp\left[-\frac{\chi^2_\nu(P)}{2p + q - 1}\right]. \qquad (A)$$



A comparison of the approximate values of x obtained from (A) with the exact values in the Beta Function Tables, for all probability levels between P = 0.995 and P = 0.005, shows that:

(a) The error is consistently positive, but much smaller than the negative error in (2); in other words the latter is slightly over-corrected.

(b) For a given p/q and varying P, the error is nearly constant; it is smallest at P = 0.995 and increases gradually in the direction of P = 0.005.

(c) For a given P, the error decreases rapidly with increasing 2p and/or decreasing 2q.

(d) Provided that p/q is larger than 4, the value of x is within 0.5 % of the exact tail value; if p/q is larger than 10, the error is within 0.1 % of that value.

AppBoxiMAraoN B 

The exponent in (A) may be written 

2p + q-l 2q 2p-l-g-l 
2q 


The factor, in square brackets is equivalent to the first term in the luiown expansion of the form 

where n = (2p-l-2g— l)/2g, which converges rapidly when n is large; i.e. when 2p is large compared 
with 2g. The above exponent may therefore be written 


(2p + 2q-l\ 


which, when inserted in (A), leads to 






(B) 


whore k = [ 3 l,(P)]/( 2 g). 

A similar comparison with the Beta Function Tables, for the same range of probability levels, shows that:

(a) Approximation (B) gives generally more accurate values than (A), except when 2q is very small, in which case they are nearly identical.

(b) For a given p/q and varying P, the error is negligible in the vicinity of P = 0.25; it increases negatively in the direction of P = 0.995, and positively in the direction of P = 0.005, the largest errors occurring at this level.

(c) For a given P, the error decreases rapidly with increasing 2p and/or decreasing 2q.

(d) Provided that (2p)³/(2q)² is larger than about 160, the values of x are within 0.5 % of the exact tail value; this implies that if 2p is larger than about 160, this degree of accuracy is attained even when p/q is as low as unity. If (2p)³/(2q)² is larger than about 2000, the error is within 0.1 % of the exact value, which implies that if 2p is larger than about 120, this degree of accuracy is attained when p/q is as low as 4.

It will be observed that, when 2q = 2, the formula does not revert exactly to (2), as is required by theory; but, unless 2p is also quite small, the error in the computed value of x is negligible.
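The two approximations can be tried numerically with a short Python sketch. It assumes the forms x ≃ exp[−χ²/(2p + q − 1)] for (A) and x ≃ [(2p − 1)/(2p + 2q − 1)]^k with k = χ²/(2q) for (B), where χ² is the value of χ² with 2q degrees of freedom that is exceeded with probability P. The function names are illustrative only, and the χ² point is found by bisection on the closed-form Poisson sum, so the sketch is limited to even 2q.

```python
import math


def chi2_upper_tail_even(c, two_q):
    """P(chi-square with 2q d.f. exceeds c), valid for even 2q (Poisson sum)."""
    q = two_q // 2
    half = c / 2.0
    term, total = 1.0, 1.0
    for j in range(1, q):
        term *= half / j
        total += term
    return math.exp(-half) * total


def chi2_point_even(prob, two_q):
    """Chi-square value (even d.f.) exceeded with probability `prob`, by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_upper_tail_even(hi, two_q) > prob:
        hi *= 2.0  # widen the bracket until the tail drops below prob
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_upper_tail_even(mid, two_q) > prob:
            lo = mid  # tail still too large: the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


def approx_a(two_p, two_q, prob):
    """Assumed form of (A): exp[-chi2 / (2p + q - 1)]."""
    chi2 = chi2_point_even(prob, two_q)
    return math.exp(-chi2 / (two_p + two_q / 2.0 - 1.0))


def approx_b(two_p, two_q, prob):
    """Assumed form of (B): [(2p - 1) / (2p + 2q - 1)] ** (chi2 / 2q)."""
    k = chi2_point_even(prob, two_q) / two_q
    return ((two_p - 1.0) / (two_p + two_q - 1.0)) ** k


if __name__ == "__main__":
    for two_q in (2, 4, 10, 20):
        print(two_q, approx_a(120, two_q, 0.995), approx_b(120, two_q, 0.995))
```

For 2p = 120, 2q = 4 and P = 0.995 both functions return values close to 0.99829, which may be compared with the exact 0.9982907 quoted in Table 1.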

The expansion of (B) leads to

$$x(2p,2q) \simeq e^{-u} - a\left(1+\tfrac{4}{3}a\right)u + a u^{2},$$

where

$$u = \frac{\chi^2_{2q}(P)}{2p+2q-1} \quad\text{and}\quad a = \frac{q}{2p+2q-1},$$

thus demonstrating its analogies with Campbell's formula (C) below.


Adaptation of Campbell's formula

In a book concerned primarily with quality control, Simon (1941) quotes (without the proof) a formula, due to Campbell (1923), designed to determine the average number of defectives in a sample of n, starting from the known average number in an infinite sample. It is a particular application of the general problem now under consideration, namely, the approximate determination of the percentage points of the Beta Function, starting from the corresponding known values for the χ² form of the Poisson exponential binomial summation. It is given in the following form:


. = Av ,-^ + ^[1442 + {3a -f- 2) ^ 4- a] -b . . 

0(C, 00, P) 


(3) 


where a(c, n, P) = average number of defectives in which P is the probability of at least c defectives in a sample of n, a(c, ∞, P) = average number of defectives in an infinite sample, A = ½(c − a − 1), in which a = a(c, ∞, P). (Simon quotes a = (a, ∞, P), which is an evident misprint.)

If G denotes the value given by the formula, then

$$a(c,n,P) = a(c,\infty,P)\,(1+G),$$

so that 1 + G is the factor by which the average number of defectives in an infinite sample must be multiplied to give that in a sample of n.

The change from Campbell's notation to the more familiar general notation is given by

$$a(c,n,P) = \{1 - x(2p,2q)\}\,n, \qquad a = a(c,\infty,P) = [\chi^2_{2q}(P)]/2,$$

where n = p + q − 1, and c = q.

Let u = a/n and r = (c − 1)/(2n),

then A = n(r − u/2).

By inserting this notation in (3) and rearranging the terms, the formula leads to 


a!(2p, 2g)~ 1- 




JU 


which expression includes the first four terms in the expansion of e^{−u}.

Hence, for the determination of the percentage points x(2p, 2q), Campbell's formula may, in effect, be re-written as

/ 1 \ 

(C) 


where 


:i;(2p, 2}) S ^ 1 + ^ j w + iirip, 


and $r = \dfrac{q-1}{2(p+q-1)}$, with $u = \dfrac{\chi^2_{2q}(P)}{2(p+q-1)}$.

For large 2p and small 2q, the last two terms become negligible, in which case it reduces to

x{ 2p, 2g) S — ?■( 1 -f |r) m. 


(CO 


Cochran's approximation


Cochran (1940), extending a method of Fisher's (1925), has introduced a useful approximation for the percentage points of the Incomplete Beta Function, when both p and q are large, his method being to determine a sufficiently accurate value of z, as used in Fisher's z-transformation.

If y is the normal deviate at probability level P, then for a given pair of arguments 2p, 2q, the following are first calculated, using Hartley's (1941) notation:

A = i(y»-i-3). A = :^^, 

y (^-i)(A-2p) 

^J{A-X) pA 


Hence, by Fisher's transformation,

$$x(2p,2q) \simeq \frac{2p}{2p + 2q\,e^{2z}}. \tag{D}$$
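The last step of Cochran's method, the passage from z back to x, can be sketched in Python as follows. The sketch assumes the usual relation x = 2p/(2p + 2q·F) between a Beta(p, q) variable and a variance ratio F with 2q and 2p degrees of freedom, together with F = e^{2z}; the function names are illustrative, and the calculation of z itself is not attempted here.

```python
import math


def beta_point_from_f(two_p, two_q, f_ratio):
    """Beta(p, q) point from a variance ratio F on (2q, 2p) d.f. (assumed convention)."""
    return two_p / (two_p + two_q * f_ratio)


def beta_point_from_z(two_p, two_q, z):
    """Same conversion starting from Fisher's z, using F = exp(2z)."""
    return beta_point_from_f(two_p, two_q, math.exp(2.0 * z))
```

With z = 0 (that is, F = 1) the conversion gives x = 2p/(2p + 2q), the mean of the Beta distribution; increasing z moves x towards zero.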


Comparison of formulae

Table 1 compares the various formulae for upper percentage points at an extreme probability level (P = 0.995). Table 2 indicates their relative accuracy on a common basis, namely, as a percentage of the exact value of x or 1 − x, whichever is the smaller, so that the deviations from the exact values, when x or 1 − x approach zero, are duly emphasized. For intermediate probability levels, the percentages lie between the tabulated extremes. It will be noted that in the case of approximation B the errors pass through zero near the mid-range of P; in the cases of A and C the errors are positive for all values of P.
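The common basis used in Table 2 can be written as a one-line calculation. The Python sketch below is illustrative only (the function name is not from the paper): it expresses the error of an approximate x as a percentage of the smaller exact tail value, x or 1 − x.

```python
def relative_tail_error(approx, exact):
    """Error of `approx` as a percentage of min(exact, 1 - exact)."""
    tail = min(exact, 1.0 - exact)
    return 100.0 * (approx - exact) / tail


if __name__ == "__main__":
    # Approximation A at 2p = 2q = 120, P = 0.995: exact 0.61620, difference +0.00978
    print(round(relative_tail_error(0.61620 + 0.00978, 0.61620), 2))
```

For that entry the smaller tail is 1 − x = 0.38380, so the error is about 2.5 % of the tail value, although it is only about 1.6 % of x itself.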




The general conclusions from these tables and other comparisons are that, for a given probability level P:

(a) When p/q exceeds about 6, approximations A, B and C have about the same degree of accuracy, so that the simpler, A or B, have the advantage.

(b) In the range 6 > p/q > 4, there is little to choose between B and C; but B is the simpler.

(c) When p/q < 4 and the distribution approaches symmetry, D gives the best results, provided that 2p and 2q are moderately large, say > 60. It may be, however, that B in this range will be sufficiently accurate for many purposes; if p/q > 2, the maximum error of x is about 2 units in the third decimal place.


Table 1. Comparison of approximate formulae at a given probability level

P = 0.995

                                         x(2p, 2q)
 2p    2q    A              B              C              D             Exact
                                           (Campbell)     (Cochran)

120     2    0.999916461    0.999916459    0.999916461    0.999862      0.999916461
             Nil           -0.000000002    Nil           -0.000054

        4    0.9982908      0.9982906      0.9982907      0.997926      0.9982907
            +0.0000001     -0.0000001      Nil           -0.000365

       10    0.982764       0.982755       0.982760       0.982076      0.982759
            +0.000005      -0.000004      +0.000001      -0.000683

       20    0.944002       0.943893       0.943941       0.943366      0.943930
            +0.000072      -0.000037      +0.000011      -0.000564

       30    0.902230       0.901839       0.902000       0.901551      0.901950
            +0.000280      -0.000111      +0.000050      -0.000399

       40    0.86160        0.86070        0.86106        0.86066       0.86093
            +0.00067       -0.00023       +0.00013       -0.00027

       60    0.78782        0.78522        0.78621        0.78568       0.78579
            +0.00203       -0.00057       +0.00042       -0.00011

      120    0.62598        0.61430        0.61855        0.61620       0.61620
            +0.00978       -0.00190       +0.00235        Nil

N.B. The second line of each entry gives the difference between the approximate and exact values.


Table 2. Relative accuracy of approximate formulae


Error of x(2p, 2q) expressed as a percentage of the smaller exact tail value (x or 1 − x < 0.5)


2p 

2? 


A 

B 

C 

(Campbell) 

D 

(Cochran) 



P/5\ 

0-996 

0-600 



0-600 


0-996 



0-996 


0-006 




% 

o/ 

/o 

% 

% 

% 

% 

<v 

/o 

% 

o/ 

/o 

o/ 

/o 

% 

% 

120 

12 

10 

♦ 

« 

♦ 

mm 

■■ 

* 


* 


-2-8 


iVil 

20 

6 

-f 0-1 

4-0-2 

4-0-2 

BiSI 

mm 


* 

* 

■xn 

r-1-0 


IXQ 


40 

3 

+ 0-6 

4-0-6 

4-0-7 

HiS’l 




■ajn 


-0-2 




60 

2 

-f 0-9 

-M-1 

-f 




+ 0-2 



-0-1 




120 

1 

-f2-6 

4-2-7 

4-4-4 

-0-6 

U 


4-0-6 


4-1-7 




30 

3 

10 

♦ 


Bfl 


♦ 

* 




-30-0 

-1-4 

-7-1 

6 

6 

4-0-1 

RS8I 


BjS| 

-0-1 


* 

♦ 


-11-1 

— 0*6 

-1-8 


10 

3 

4-0-4 

4-0-5 

4-0-8 


-0-1 


BSu 


4-1-3 

-2-2 

-0-1 



16 

2 

4-0-8 

4-1-0 

4-1-9 


-0-1 

4-0-6 

ISpI 


4" 2‘i5 

-0-7 


BIfil 


30 

1 

4-2-4 

4-2-7 

4-7-1 


-0 1 

4-1-8 

m 

4-0-6 

4-8-2 





* Error smaller than ± 0.05 %.


























Wilson-Hilferty approximation for χ²: adjustment

This formula (Wilson & Hilferty, 1931) for the percentage points of the χ² distribution is

$$\chi^2_P \simeq \nu\left[1 - \frac{2}{9\nu} + y_P\sqrt{\frac{2}{9\nu}}\,\right]^{3},$$

where ν represents the degrees of freedom, and y_P the standardized normal deviate corresponding to probability level P. A table has been published in Biometrika (Merrington, 1941), comparing the approximations derived from this formula with the exact values, at various probability levels between P = 0.995 and P = 0.005. It shows the remarkable accuracy of the formula, the maximum errors varying from about ± 0.04, when ν = 30, to about ± 0.024, when ν = 100.
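A minimal Python sketch of the unadjusted Wilson-Hilferty calculation is given below. It assumes the standard form χ² ≃ ν[1 − 2/(9ν) + y√(2/(9ν))]³ and takes P as the probability of exceeding the tabulated χ², so that P = 0.995 gives a low percentage point; the function name is illustrative, and the adjustment discussed in this note is not included.

```python
from statistics import NormalDist


def wilson_hilferty(nu, upper_tail_prob):
    """Approximate chi-square point exceeded with probability `upper_tail_prob`."""
    # Normal deviate with the same upper-tail probability (negative for P > 0.5)
    y = NormalDist().inv_cdf(1.0 - upper_tail_prob)
    c = 2.0 / (9.0 * nu)
    return nu * (1.0 - c + y * c ** 0.5) ** 3


if __name__ == "__main__":
    for p in (0.995, 0.5, 0.005):
        print(p, round(wilson_hilferty(30, p), 3))
```

For ν = 30 the results lie within about 0.05 of the commonly tabulated exact points 13.787 (P = 0.995) and 53.672 (P = 0.005), in line with the size of error quoted above.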

When these errors were plotted against the exact values on logarithmic paper, it was observed that for a given probability level, they varied inversely with √ν very closely. It follows that this square root relation may be used to adjust the Wilson-Hilferty formula, bringing the values computed therefrom still nearer to the exact values.

If the difference (at ν = 30) between the Wilson-Hilferty value and the exact value, when multiplied by √(100/30), is treated as a coefficient C (which may be positive or negative), the required adjustment for any value of ν is given by

Adjustment = 

For various probability levels P, the values of C are given in the following table:

    P         C           P         C
  0.995    +0.233       0.250    +0.039
  0.990    +0.157       0.100    +0.066
  0.975    +0.067       0.050    +0.035
  0.950    +0.011       0.025    -0.016
  0.900    -0.029       0.010    -0.120
  0.750    -0.046       0.005    -0.227
  0.500    -0.013





A test against the Merrington Table shows that this adjustment leads to values of χ² between ν = 30 and ν = 100 at all probability levels with an accuracy of ± 0.001, i.e. to four or five significant figures. Since the Wilson-Hilferty approximation assumes a normal distribution about 1 − 2/(9ν), which tends to unity as ν increases to infinity, and since the adjustment tends to zero under those conditions, it follows that the latter may also be safely applied for an indefinitely large ν.

It should be added that an adjustment on similar principles is not applicable to the Fisher approximation for χ².

REFERENCES 

Campbell, G. A. (1923). Bell Syst. Tech. J. January.

Cochran, W. G. (1940). Note on an approximate formula for significance levels of z. Ann. Math. Statist. 11, 93.

Fisher, R. A. (1925). Statistical Methods for Research Workers. Edinburgh: Oliver and Boyd. 1st edition.

Fisher, R. A. & Yates, F. (1938). Statistical Tables for Biological, Agricultural and Medical Research. Edinburgh: Oliver and Boyd.

Hartley, H. O. (1941). Tables of percentage points of the Incomplete Beta Function. Methods of interpolation. Biometrika, 32, 166.

Merrington, M. (1941). Numerical approximations to the percentage points of the χ² distribution. Biometrika, 32, 200.

Simon, L. E. (1941). An Engineer's Manual of Statistical Methods, p. 185. New York: John Wiley and Sons, Inc.

Thompson, C. M. (1941a, b). Tables of percentage points of the Incomplete Beta Function. Biometrika, 32, 161.

Thompson, C. M. (1941). Table of percentage points of the χ² distribution. Biometrika, 32, 187.

Wilson, E. B. & Hilferty, M. M. (1931). The distribution of chi-square. Proc. Nat. Acad. Sci. Wash. 17, 684.






REVIEWS 

A First Course in Mathematical Statistics. By C. E. Weatherburn. Cambridge University Press. Price 16s.

An outstanding feature of the present statistical time is the number of text-books which are being 
written, and each one from a slightly different point of view. It is this which makes statistical theory 
interesting to study, for there can be no rigid approach to a subject which is used and expounded by so 
many and diverse persons. Professor Weatherburn has taken a rather formal mathematical exposition 
of the subject, and mathematical students will find his book both interesting and profitable to read. 
Numerical examples are given for the reader to apply the appropriate mathematical technique. It is 
possible that these would have been of greater utility if they had contained the material in its crude state, 
and had not been streamlined so that the application of the technique is immediately obvious, but 
nevertheless many new examples are there. 

I am not sure whether this book will be entirely useful to students of other subjects than mathematics. 
While the mathematical analysis is undoubtedly clear it is possible that many will not be able to follow 
it in detail, and the conclusions of the analysis are not emphasized strongly. We may contrast with this 
Fisher's Statistical Methods for Research Workers, where no analysis is given, but where the relevant 
formulae and their interpretation are stated unmistakeably and their applications to material in its 
crude state set out so that the student may calculate for himself. 

Probability theory is the foundation stone on which the whole of statistical theory is built. It is dis- 
appointing therefore to find that it is given somewhat perfunctory treatment in one chapter and the part 
it plays in (say) statistical tests of significance is not brought out and emphasized. There is a tendency 
nowadays in applying statistical technique to regard the 5 % and 1 % levels of significance as sacrosanct 
and those coming fresh to the subject should learn that custom is the only reason for their choice. 

In spite of the criticisms which I make, however, I would recommend this book to students who have 
obtained some idea of the aims and objectives of statistical theory, and who are desirous of learning the 
development of the mathematical technique as well as its application. Professor Weatherburn's mathematical analysis makes pleasant reading and may well throw new light on old methods for those who 
have learnt the rudiments of the theory. 

F. N. DAVID 


Advances in Genetics, Volume 1. New York, N.Y. : Academic Press, 1947. 

This is the first number of a new periodical, probably an annual, summarizing recent work in various fields. Of the nine articles, ranging from 12 to 96 pages, with mean 42.6, s.d. 7.89, and a positively skew distribution, perhaps the most interesting to European geneticists will be that on the genetics of the ciliate Protozoa, Paramecium and Euplotes. Here Sonneborn describes work almost entirely done in America, with very surprising results. Thus Paramecium aurelia consists of at least seven endogamous varieties, each with two exogamous mating types, which might be called sexes were it not that in P. bursaria one of the varieties has no less than eight mating types.

Shrode and Lush's article on the genetics of cattle gives a very condensed account of the large amount of work which has been done on the inheritance of economically important characters such as milk yield and growth rate. For example cattle biometricians have used the important concept of ‘heritability’, meaning the fraction of the variance of a character due to additive genetic differences. Within a herd 
this rarely exceeds 30 %. More space is devoted to work on the genetics of colour and the like, which is 
of far less economic importance, and the review of progeny testing methods is disappointingly brief. 
However, the bibliographical references will be useful. Similarly, Atwood’s article on forage crops, though 
most valuable as a guide to the literature, does not give a detailed account of any of the biometric work 
which has been done on grasses and clovers. 

Only two of the papers give data which a biometrician could immediately utilize. These are Gordon's account of polymorphism in fish populations, and Spencer's of mutations in wild Drosophila species, which unfortunately does not include some valuable recent Italian and Russian work. Gordon's results call for the development of methods of estimating gene frequency similar to those used with human blood groups. Spencer is mainly concerned with results, but these are often given in sufficient detail to interest biometricians, though no attempt is made to summarize Wright's fundamental statistical theory.





The other articles will be less attractive to biometricians, though it is of interest to see how statistical 
methods are demanded by the mere fact that the genus Crepis, whose evolution is reviewed by Babcock, 
includes 196 species, most of which have been examined cytologically, and between which 130 of the 
38,220 possible crosses have been made. 

The volume will be indispensable to geneticists. Biometricians certainly cannot neglect it. 

J. B. S. H.

Mathematical Methods of Statistics. By H. Cramér. Princeton University Press. 1946. $6.00.

This book was written by Prof. Cramer during the war and has been published first in Sweden and then 
by an offset process by the Princeton University Press in the U.S.A. It is a definitive exposition of the 
theory of mathematical statistics as it existed in 1940 (about) and it is worth while therefore to consider 
its contents in some detail. Prof. Cramér has divided his exposition into three parts; the first part is purely mathematical. The theory of sets and of such Lebesgue measure as is necessary for the understanding of the second part is developed first of all. Such a development will be useful for the student of mathematical statistics coming fresh to the theory of measure in that he receives guidance as to what are the elements essential for him to understand. Chapters 11 and 12 on matrices, determinants and quadratic forms and miscellaneous complements do not fit into this general scheme but have obviously been included here as part of the mathematical equipment necessary for the student. Possibly Chapter 10 
on Fourier Integrals would have fitted more naturally into Part II but this is a matter of taste. 

Part II begins with a formal development of the theory of probability as given by the French and 
Russian schools of probability, and which Prof. Cramér has already given in his Cambridge tract ‘Random 
Variables and Probability Distributions’. The treatment here seems simpler, however, than in his earlier 
tract and there is a more practical flavour to his exposition. This part while still purely mathematical 
begins to introduce distributions and ideas which are familiar to the statistician. 

The title of the third part is ‘ Statistical Inference’ and the main outline is that of small sample theory 
developed during the past twenty-five years. The illustrations are numerical as well as mathematical and an attempt is made to show the student the numerical applications of the processes through which his mathematical theory leads him. The treatment is not exhaustive but the student who has assimilated this part will have little difficulty in extending his knowledge by further reading.

As a textbook of mathematical statistics this book will remain unrivalled for many years to come. The 
mathematical exposition is clear, the development of ideas logical throughout, and the theorems are 
presented in a very general way. Any student of mathematics who wishes to get a picture of what statis- 
tical theory is about will be led inevitably to a study of this book. To those who wish to become statisticians 
it will be necessary to supplement the reading by a practical course in which the mathematical tools 
are tried out on numerical examples. This aspect of statistical work the book does not cover, but it is obvious that this would be the case from the title. It only remains to say to the student ‘This is a good book, buy it’.

F. N. DAVID


CORRIGENDA 

(Biometrika, 34, 176-7)

In J. Wishart's paper on ‘The cumulants of the z and of the logarithmic χ² and t distributions’, the following corrections should be made:

p. 176, 1st line of section 3: read ‘log |t|’ for ‘log t’, in two places.
p. 177, 1st line following equation (32): read ‘log |x|’ for ‘log x’.




[All Rights reserved]


BIOMETRIKA. Vol. XXXIV, Parts III and IV 


CONTENTS 

                                                                                        PAGE
On the distribution of the rank correlation coefficient τ when the variates are not independent. By Wassily Hoeffding    183-196
The significance of rank correlations where parental correlation exists. By H. E. Daniels and M. G. Kendall    197-208
Testing for normality. By R. C. Geary    209-242
The stratified semi-stationary population. By S. Vajda    243-254
A simple approach to confounding and fractional replication in factorial experiments. By O. Kempthorne    255-272
A comparison of stratified with unrestricted random sampling from a finite population. By P. Armitage    273-280
Some theorems on time series. I. By P. A. P. Moran    281-291
Rank correlation between two variables, one of which is ranked, the other dichotomous. By J. W. Whitfield    292-296
The variance of τ when both rankings contain ties. By M. G. Kendall    297-298
A χ² ‘smooth’ test for goodness of fit. By F. N. David    299-310
An exact test for the equality of variances. By R. L. Plackett    311-319
The estimation from individual records of the relationship between dose and quantal response. By D. J. Finney    320-334
A power function for tests of randomness in a sequence of alternatives. By F. N. David    335-339
A numerical solution of the problem of moments. By H. O. Hartley and S. H. Khamis    340-351
Approximation to percentage points of the z-distribution. By A. H. Carter    352-358

Miscellanea

Note on the cumulants of Fisher's z-distribution. By Leo A. Aroian    359-360
A note on the mean deviation from the median. By K. R. Nair    360-362
On the method of paired comparisons. By P. A. P. Moran    363-365
Notes on the calculation of autocorrelations of linear autoregressive schemes. By M. H. Quenouille    365-367
Approximate formulae for the percentage points of the incomplete beta function and of the χ² distribution. By D. Halton Thomson    368-372

Reviews

A First Course in Mathematical Statistics    373
Advances in Genetics    373
Mathematical Methods of Statistics    374

A volume of Biometrika contains about 400 pages, with plates and tables, and it is hoped that in future this will be published annually in two half-yearly issues.

Papers for publication may be sent to Professor E. S. Pearson, Department of Statistics, University College, London, W.C. 1, or if more convenient may be submitted through a member of the Editorial Committee, viz.

Professor Harald Cramér, University of Stockholm, Sweden.

Dr R. C. Geary, Statistics Branch, Department of Industry and Commerce, Dublin.

Professor M. Greenwood, F.R.S., London School of Hygiene and Tropical Medicine, London, W.C. 1.

Professor J. B. S. Haldane, F.R.S., University College, London, W.C. 1.

Dr G. M. Morant, R.A.F. Institute of Aviation Medicine, R.A.F. Station, Farnborough, Hants.

Dr John Wishart, School of Agriculture, Cambridge.

It is a condition of publication in Biometrika that the paper shall not already have been issued elsewhere, and will not be reprinted without leave of the Editors.

Contributors receive 26 copies of their papers free. Joint authors 16 copies each.

The subscription price, payable in advance, is Inland 46s. net per volume and Abroad 64s. net (including packing and postage). Owing to the scarcity of early volumes, the following rates must now be charged for complete sets. Vols. I-XXXIV, including XX®; £120, 6s. in wrappers, not including postage. At present certain volumes are out of print, but steps are being taken to re-issue these as quickly as printing facilities permit. Recent volumes may still be obtained at the wrapper price; this is 64s. inland, including postage. Index to Vols. I to V, 2s. net. Index to Vols. I to XV, 6s. net. Cheques must be made payable to Biometrika, crossed “a/c Biometrika Trust” and sent to The Secretary, Biometrika Office, Department of Statistics, University College, London, W.C. 1, to whom all orders for series, single copies and offprints should be addressed. All foreign cheques must be drawn in sterling and on a Bank having a London Agency.

First printed in Great Britain at the University Press, Cambridge
Reprinted by offset-litho by Percy Lund Humphries & Co. Ltd




